Dear friends,
Most of the work of building a machine learning system is debugging rather than development. Internalizing this mental framework has made me a more efficient machine learning engineer.
Machine learning software often has to carry out a sequence of steps; such systems are called pipelines or cascades. Say you want to build a system to route an ecommerce site’s customer emails to the appropriate department (is this apparel, electronics, and so on?), then retrieve relevant product information using semantic search, and finally draft a response for a human representative to edit. Each of these steps could have been done by a human. By examining them individually and seeing where the system falls short of human-level performance, you can decide where to focus your attention. While debugging a system, I frequently have a “hmm, that looks strange” moment that suggests what to try next, and I’ve had such moments many times.
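To make the idea of a cascade concrete, here is a minimal sketch of the email-routing example. The three stubs stand in for a trained classifier, a semantic-search index, and an LLM call; none of this reflects a specific library's API.

```python
# Minimal sketch of a three-step pipeline: classify, retrieve, draft.
# The stub implementations are illustrative placeholders; in practice each
# step would call a trained classifier, a semantic-search index, and an LLM.

PRODUCT_CATALOG = {
    "apparel": ["Trail running shoes, sizes 6-13", "Waterproof hiking jacket"],
    "electronics": ["Noise-canceling headphones", "4K action camera"],
}

def classify_department(email: str) -> str:
    """Step 1: route the email to a department (stub: keyword match)."""
    keywords = ("camera", "headphone", "charger")
    return "electronics" if any(k in email.lower() for k in keywords) else "apparel"

def retrieve_products(email: str, department: str) -> list[str]:
    """Step 2: fetch relevant product info (stub for a semantic-search query)."""
    return PRODUCT_CATALOG.get(department, [])

def draft_reply(email: str, products: list[str]) -> str:
    """Step 3: draft a response for a human representative to edit (stub for an LLM call)."""
    bullet_list = "\n".join(f"- {p}" for p in products)
    return f"Thanks for reaching out! Here is some information that may help:\n{bullet_list}"

def handle_email(email: str) -> dict:
    # Keep each step's intermediate output so you can inspect the whole cascade.
    department = classify_department(email)
    products = retrieve_products(email, department)
    return {"department": department, "products": products, "draft": draft_reply(email, products)}

print(handle_email("Hi, my new 4K action camera won't power on."))
```

Because each step's intermediate output is kept, you can inspect exactly which stage falls short of human-level performance.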
When it comes to noticing things like this, experience working on multiple projects is helpful. Machine learning systems have a lot of moving parts. When you have seen many learning curves, you start to hone your instincts about what’s normal and what’s anomalous; or, when you have prompted a large language model (LLM) to output JSON many times, you start to get a sense of the most common error modes. These days, I frequently play with building different small LLM-based applications on weekends just for fun. Seeing how they behave (as well as consulting with friends on their projects) is helping me hone my instincts about when such applications go wrong and what plausible solutions look like.
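For example, one common error mode when asking an LLM for JSON is wrapping the object in a Markdown code fence or surrounding prose. A small defensive parser, sketched below with Python's standard json module (the heuristics are illustrative, not exhaustive), catches many of these cases:

```python
import json
import re

def parse_llm_json(raw: str) -> dict:
    """Best-effort parse of JSON produced by an LLM.

    Handles two common error modes: a Markdown code fence around the JSON,
    and extra prose before or after the object.
    """
    text = raw.strip()

    # Strip a Markdown code fence such as ```json ... ```
    fence = re.search(r"```(?:json)?\s*(.*?)\s*```", text, flags=re.DOTALL)
    if fence:
        text = fence.group(1)

    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost braces, in case of surrounding prose.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise

reply = 'Sure! Here is the JSON:\n```json\n{"department": "electronics"}\n```'
print(parse_llm_json(reply))  # {'department': 'electronics'}
```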
The notion that much of machine learning development is akin to debugging arises from this observation: When we start a new machine learning project, we don’t know what strange and wonderful things we’ll find in the data. With prompt-based development, we also don’t know what strange and wonderful things a generative model will produce. This is why machine learning development is much more iterative than traditional software development: We’re embarking on a journey to discover these things. Building a system quickly and then spending most of your time debugging it is a practical way to get such systems working.
Keep learning!
Andrew
DeepLearning.AI Exclusive
Robert Monarch, the instructor of our new specialization AI for Good, spoke with us about how AI is being applied to social and environmental challenges and how you can join the growing AI for Good movement. Read the interview
News
Stable Biases
Stable Diffusion may amplify biases in its training data in ways that promote deeply ingrained social stereotypes.
What's new: The popular text-to-image generator from Stability.ai tends to underrepresent women in images of prestigious occupations and overrepresent darker-skinned people in images of low-wage workers and criminals, Bloomberg reported.
How it works: Stable Diffusion was pretrained on five billion text-image pairs scraped from the web. The reporters prompted the model to generate 300 face images each of workers in 14 professions, seven of them stereotypically “high-paying” (such as lawyer, doctor, and engineer) and seven considered “low-paying” (such as janitor, fast-food worker, and teacher). They also generated images for three negative keywords: “inmate,” “drug dealer,” and “terrorist.” They analyzed the skin color and gender of the resulting images.
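The report doesn't describe the reporters' exact pipeline, but a rough sketch of this kind of audit using the open-source diffusers library might look like the following. The model ID, prompt template, and batch size are assumptions for illustration, and analyzing skin color and gender would be a separate step.

```python
# Rough sketch of generating face images per occupation with Stable Diffusion.
# Requires the diffusers library and a GPU. The model ID and prompt wording
# below are placeholders, not the setup used in the Bloomberg analysis.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

occupations = ["lawyer", "doctor", "engineer", "janitor", "fast-food worker", "teacher"]
images_per_occupation = 300
batch_size = 10

for occupation in occupations:
    prompt = f"a color photograph of the face of a {occupation}"
    for batch_start in range(0, images_per_occupation, batch_size):
        images = pipe(prompt, num_images_per_prompt=batch_size).images
        for i, image in enumerate(images):
            image.save(f"{occupation.replace(' ', '_')}_{batch_start + i:03d}.png")
```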
Results: Stable Diffusion’s output aligned with social stereotypes but not with real-world data.
Behind the news: Image generators have been found to reproduce and often amplify biases in their training data.
Why it matters: Not long ago, the fact that image generators reflect and possibly amplify biases in their training data was mostly academic. Now, because a variety of software products integrate them, such biases can leach into products as diverse as video games, marketing copy, and law-enforcement profiles.
We're thinking: While it’s important to minimize bias in our datasets and trained models, it’s equally important to use our models in ways that support fairness and justice. For instance, a judge who weighs individual factors in decisions about how to punish a wrongdoer may be better qualified to decide than a model that simply reflects demographic trends in criminal justice.
AI & Banking: Progress Report
One bank towers above the competition when it comes to AI, a recent study suggests.
What’s new: A report from market research firm Evident Insights measures use of AI by the banking industry.
Results: JPMorgan Chase excelled in all four categories with a combined score of 62.6 out of 100. The next-highest scorers were Royal Bank of Canada (41.4) and Citigroup (39.0). The authors credited JPMorgan Chase with successful long-term investments in AI research coupled with an openness to letting AI talent publish academic work, among other highlights.
Behind the news: A growing number of banks are taking advantage of generative AI.
Why it matters: Finance is among the few industries outside tech that can afford to hire large teams of top AI talent. It’s also a data-heavy industry where applications — fraud detection, financial forecasting, and reconciling and closing accounts — can bring a ready payoff. The combination has made banking a hotbed for AI talent.
A MESSAGE FROM WORKERA
Introducing Skills AI, the tool that harnesses large language models for managers to develop, upskill, and retain teams at the cutting edge of competency. Join the waitlist
Language Models’ Impact on Jobs
Telemarketers and college professors are most likely to find their jobs changing due to advances in language modeling, according to a new study.
Results: The authors concluded that telemarketing was most exposed to impact by language models. Among the 20 occupations with the greatest exposure, 14 were post-secondary teaching roles including university-level teachers of language, history, law, and philosophy. The top 20 also included sociologists, political scientists, arbitrators, judges, and psychologists. Among industries, the authors found that legal services were most exposed. Of the 20 industries with the greatest exposure, 11 involved finance including securities, insurance, and accounting.
Behind the news: The authors adapted their method from a 2021 study that scored each occupation’s and each industry’s exposure to AI areas defined by the Electronic Frontier Foundation, including game playing, computer vision, image generation, translation, and music recognition. The previous study found that the most exposed jobs were genetic counselors, financial examiners, and actuaries. The most exposed industries were financial securities, accounting, and insurance.
We’re thinking: As the authors note, an occupation’s exposure to AI does not necessarily put jobs at risk. History suggests the opposite can happen. A 2022 study found that occupations exposed to automation saw increases in employment between 2008 and 2018. Several other studies found that countries with high levels of automation also tend to have high overall levels of employment.
Sample-Efficient Training for Robots
Training an agent that controls a robot arm to perform a task — say, opening a door — that involves a sequence of motions (reach, grasp, turn, pull, release) can take from tens of thousands to millions of examples. A new approach pretrained an agent on many tasks for which lots of data was available, so it needed dramatically fewer examples to learn related tasks.
What’s new: Joey Hejna and Dorsa Sadigh at Stanford used a variation on reinforcement learning from human feedback (RLHF) to train an agent to perform a variety of tasks in simulation. The team didn’t handcraft the reward functions. Instead, neural networks learned them.
RLHF basics: A popular approach to tuning large language models, RLHF follows four steps: (1) Pretrain a generative model. (2) Use the model to generate data and have humans assign a score to each output. (3) Given the scored data, train a model — called the reward model — to mimic the way humans assigned scores. Higher scores are tantamount to higher rewards. (4) Use scores produced by the reward model to fine-tune the generative model, via reinforcement learning, to produce high-scoring outputs. In short, a generative model produces an example, a reward model scores it, and the generative model learns based on that score.
Key insight: Machine-generated data is cheap, while human-annotated data is expensive. So, if you’re building a neural network to estimate rewards for several tasks that involve similar sequences of motions, it makes sense to pretrain it for a set of tasks using a large quantity of machine-generated data, and then fine-tune a separate copy for each task to be performed using small amounts of human-annotated data.
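In robotics, the human scores in step (3) often take the form of pairwise preferences between short trajectory segments. Below is a generic PyTorch sketch of training a reward model on such preferences; the architecture, dimensions, and data are illustrative placeholders, not the authors' exact setup.

```python
import torch
import torch.nn as nn

# Generic sketch of learning a reward model from pairwise preferences,
# the Bradley-Terry-style objective commonly used in RLHF. Sizes below
# are illustrative, not taken from the paper.

class RewardModel(nn.Module):
    def __init__(self, obs_dim: int = 39, act_dim: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, obs, act):
        # Per-step reward for a batch of (observation, action) pairs.
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def preference_loss(model, segment_a, segment_b, prefer_a):
    """Binary cross-entropy on which segment the annotator preferred.

    segment_a / segment_b: (observations, actions) tensors of shape (batch, time, dim).
    prefer_a: 1.0 if the annotator preferred segment A, else 0.0.
    """
    # Sum per-step rewards over each trajectory segment.
    return_a = model(*segment_a).sum(dim=-1)
    return_b = model(*segment_b).sum(dim=-1)
    # P(A preferred) = sigmoid(return_a - return_b), as in the Bradley-Terry model.
    logits = return_a - return_b
    return nn.functional.binary_cross_entropy_with_logits(logits, prefer_a)

# Toy usage with random tensors standing in for real annotated segments.
model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
seg_a = (torch.randn(8, 25, 39), torch.randn(8, 25, 4))
seg_b = (torch.randn(8, 25, 39), torch.randn(8, 25, 4))
labels = torch.randint(0, 2, (8,)).float()
loss = preference_loss(model, seg_a, seg_b, labels)
opt.zero_grad(); loss.backward(); opt.step()
```

Once the reward model is trained, its per-step scores stand in for a handcrafted reward function during reinforcement learning.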
How it works: The authors trained an RL agent to perform 10 simulated tasks from Meta-World such as pushing a block, opening a door, and closing a drawer. For each task, they fine-tuned a separate pretrained vanilla neural network to calculate rewards used in training the agent.
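Structurally, the key insight above maps onto a simple recipe: pretrain one reward network on a large pool of cheap labels drawn from many tasks, then fine-tune a separate copy for each new task on a handful of human annotations. The sketch below is schematic only; fit() stands in for whatever objective you actually use (such as the preference loss above), and random tensors stand in for real data.

```python
import copy
import torch
import torch.nn as nn

# Schematic sketch of the pretrain-then-fine-tune recipe for reward models.
# fit() uses a placeholder mean-squared-error objective purely so the sketch runs.

def fit(model, inputs, targets, steps, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(inputs), targets)
        opt.zero_grad(); loss.backward(); opt.step()
    return model

reward_net = nn.Sequential(nn.Linear(43, 256), nn.ReLU(), nn.Linear(256, 1))

# 1) Pretrain on a large pool of cheap, machine-generated labels from many tasks.
pretrain_x, pretrain_y = torch.randn(10_000, 43), torch.randn(10_000, 1)
fit(reward_net, pretrain_x, pretrain_y, steps=200, lr=1e-3)

# 2) For each new task, fine-tune a separate copy on a small human-annotated set.
task_reward_models = {}
for task in ["open-window", "close-door"]:
    few_x, few_y = torch.randn(64, 43), torch.randn(64, 1)  # a few dozen labels
    task_reward_models[task] = fit(copy.deepcopy(reward_net), few_x, few_y, steps=50, lr=1e-4)
```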
Results: Trained to open a window, the agent achieved 100 percent success after fine-tuning on 64 human-annotated motion sequences. Trained to close a door, it achieved 95 percent success with 100 human-annotated motion sequences. In contrast, using the same number of examples, PEBBLE, another RL method that involves human feedback, achieved 10 percent and 75 percent success respectively. Fed machine-generated examples rather than human feedback, the agent achieved 100 percent success on all Meta-World tasks except pressing a button after fine-tuning on 2,500 examples — 20 times fewer than PEBBLE required to achieve the same performance.
Why it matters: OpenAI famously fine-tuned ChatGPT using RLHF, which yielded higher-quality, safer output. Now this powerful technique can be applied to robotics.
We're thinking: Pretraining followed by fine-tuning opens the door to building AI systems that can learn new tasks from very little data. It's exciting to see this idea applied to building more capable robots.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and LandingAI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.