Dear friends,
Inexpensive token generation and agentic workflows for large language models (LLMs) open up intriguing new possibilities for training LLMs on synthetic data. Pretraining an LLM on its own directly generated responses to prompts doesn't help. However, an LLM wrapped in an agentic workflow may produce higher-quality output than it can generate directly. In this case, the higher-quality output might be useful as pretraining data for the LLM itself.
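To make the idea concrete, here is a minimal sketch of a reflection-style agentic workflow that turns prompts into synthetic training pairs. The `call_llm` function is a hypothetical placeholder, not a real API; swap in an actual client to run this against a model.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a chat-completion API call."""
    return "model output for: " + prompt


def generate_with_reflection(task_prompt: str, rounds: int = 2) -> str:
    """Draft an answer, then iteratively critique and revise it."""
    draft = call_llm(task_prompt)
    for _ in range(rounds):
        critique = call_llm(
            f"Critique this answer.\nPrompt: {task_prompt}\nAnswer: {draft}"
        )
        draft = call_llm(
            f"Revise the answer using the critique.\nPrompt: {task_prompt}\n"
            f"Answer: {draft}\nCritique: {critique}"
        )
    return draft


# Each (prompt, improved response) pair is a candidate synthetic training example.
prompts = ["Explain why the sky is blue.", "Summarize the theory of evolution."]
training_pairs = [(p, generate_with_reflection(p)) for p in prompts]
```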
Efforts like these have precedents. For example, AlphaGo reached superhuman skill at Go partly by training on games it generated through self-play.
A significant barrier to using LLMs prompted via agentic workflows to produce their own training data is the cost of generating tokens. Say we want to generate 1 trillion tokens to extend a pre-existing training dataset. Currently, at publicly announced prices, generating 1 trillion tokens using GPT-4-turbo ($30 per million output tokens), Claude 3 Opus ($75), Gemini 1.5 Pro ($21), and Llama-3-70B on Groq ($0.79) would cost, respectively, $30M, $75M, $21M and $790K. Of course, an agentic workflow that uses a design pattern like Reflection would require generating more than one token per token that we would use as training data. But budgets for training cutting-edge LLMs easily surpass $100M, so spending a few million dollars more for data to boost performance is quite feasible.
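For readers who want to adapt this arithmetic, here is a short back-of-envelope script. The 3x token overhead for an agentic workflow is an illustrative assumption, not a measured figure.

```python
# Back-of-envelope version of the cost arithmetic above. Prices are USD per
# million output tokens at the publicly announced rates quoted in the text.
TARGET_TOKENS = 1_000_000_000_000  # 1 trillion tokens of training data

price_per_million = {
    "GPT-4-turbo": 30.00,
    "Claude 3 Opus": 75.00,
    "Gemini 1.5 Pro": 21.00,
    "Llama-3-70B on Groq": 0.79,
}

# Illustrative assumption: a Reflection-style workflow generates ~3 tokens
# for every token kept as training data. Set to 1 for direct generation.
overhead = 3

for model, price in price_per_million.items():
    cost = TARGET_TOKENS / 1_000_000 * price * overhead
    print(f"{model}: ${cost:,.0f}")
```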
That’s why I believe agentic workflows will open up intriguing new opportunities for high-quality synthetic data generation.
Keep learning! Andrew
P.S. In “Prompt Engineering for Vision Models,” taught by Abby Morgan, Jacques Verré, and Caleb Kaiser of Comet, you’ll learn how to prompt and fine-tune a variety of vision models for image generation, image editing, object detection, and segmentation. For example, you’ll use OWL-ViT to detect an object you describe in a text prompt, pass the bounding box to SAM to create a segmentation mask, and feed the mask into Stable Diffusion with a text prompt to replace the original object with a new one. Controlling vision models can be tricky, and this course will teach you the techniques to control their output. Get started here!
News

Think Different Small

Apple is thinking small — very small — with a new family of open large language models. What's new: Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, and colleagues at Apple released Open Source Efficient LLM (OpenELM), a family of smaller large language models. OpenELM ranges from 270 million parameters — plenty small enough to fit on a phone — to 3 billion parameters. How it works: OpenELM comes in pretrained and instruction-tuned versions with parameter counts of 270 million, 450 million, 1.1 billion, and 3 billion. They can process 2,048 tokens of context. The architecture uses layer-wise scaling, which allocates fewer attention heads and narrower fully connected layers to transformer layers closer to the input and more to layers closer to the output. The release includes weights, code for training and inference, and code for running the models on Apple chips.
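As a rough illustration of layer-wise scaling, the sketch below ramps attention heads and feed-forward width linearly across layers. The constants are illustrative, not OpenELM's published configuration.

```python
def layerwise_scaling(num_layers, min_heads, max_heads,
                      min_ffn_mult, max_ffn_mult, model_dim):
    """Ramp per-layer width linearly from the input side to the output side."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn_mult = min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)
        configs.append({
            "layer": i,
            "attention_heads": heads,
            "ffn_hidden_dim": int(ffn_mult * model_dim),
        })
    return configs


# Illustrative 16-layer configuration: narrow early layers, wide late ones.
for cfg in layerwise_scaling(16, min_heads=4, max_heads=12,
                             min_ffn_mult=1.0, max_ffn_mult=4.0, model_dim=1024):
    print(cfg)
```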
Results: OpenELM beat a number of other open-source models of comparable size trained solely on publicly available data. Notably, the 1.1 billion-parameter version outperformed the similarly sized OLMo while pretraining on half as many tokens.
Why it matters: After years of becoming only larger, neural networks lately have also been getting smaller. The smallest OpenELMs are tiny compared to, say, Microsoft’s Phi-3-mini. Apple has an extra incentive to make models capable of running on edge devices like phones. The company makes a major selling point of user privacy, and models that run entirely on a smartphone (as opposed to in the cloud) keep the user’s activity under wraps. We're thinking: DeLighT introduced this layer-scaling approach in 2020. Sometimes it takes a while for good ideas to catch on!
AI Trends in Depth

More expensive models, superhuman performance, growing impacts on society — an extensive report takes stock of developments in machine learning over the past year. What's new: Stanford’s Institute for Human-Centered AI published the seventh “AI Index Report,” its annual overview of the state of AI. The report documents rising costs and capabilities, a shift from academic to corporate dominance, and the public’s anxiety as the technology becomes ever more embedded in daily life. Themes and findings: The 500-page report collates a wide variety of papers, benchmarks, market research, and surveys published in 2023. It delves deeply into AI technology, economics, governance, and impact.
Behind the news: The differences between the new edition and the initial 2018 edition highlight the field’s rapid pace of change. For instance, the 2018 report opened by trumpeting the nearly 9x growth of AI research papers published between 2000 and 2017. The new one opened not with the annual rate of research publications (though it has roughly doubled since 2017) but with a graph of industry’s growing dominance in innovation. The Batch has covered several editions. Why it matters: The “AI Index Report” offers a detailed snapshot of AI as it advances at an unprecedented rate and shows potential to revolutionize virtually every field of human endeavor. It dives deeply into areas of special concern to researchers (such as Gemini’s nearly $200 million training cost), practitioners (for instance, the slightly narrowing gender gap among computer science PhDs), businesses (the sharply rising number of regulations), and users (half of those who are aware of ChatGPT use it weekly). This year’s report includes new emphases on public opinion and geopolitics. We're thinking: It’s heartening to see AI thriving. The field faces daunting challenges, yet the report highlights achievements in foundation models, science, medicine, and elsewhere that portend greater benefits directly ahead. What an exciting time for AI!
NEW FROM DEEPLEARNING.AI

Expand your prompting skills with our new short course, “Prompt Engineering for Vision Models.” Learn how to prompt and fine-tune vision models to accomplish tasks from image generation to object detection. Start learning today
Amazon Rethinks Cashier-Free Stores

Amazon is removing grab-and-go shopping from its cart. What’s new: Amazon withdrew Just Walk Out, an AI-driven checkout service, from most of its Amazon Fresh grocery stores, The Information reported. Instead, the stores will provide smart shopping carts. (Disclosure: Andrew Ng is a member of Amazon’s Board of Directors.)
Behind the news: Amazon introduced Just Walk Out in 2016 at its first Amazon Go convenience store in Seattle. It extended the system to Amazon Fresh in 2020. Between September 2020 and September 2022, Amazon opened 44 Fresh stores in the U.S. and 19 in the UK, most of which included Just Walk Out. But Amazon’s brick-and-mortar locations suffered during the COVID-19 pandemic. From September 2022 to mid-2024, amid broader cost-cutting efforts, the company paused opening new grocery stores. Why it matters: Grab-and-go shopping seems like a solid bet, given the increasing focus of retailing on immediate gratification. Yet Amazon’s retreat from Just Walk Out in larger stores suggests that the technology is less well suited to such environments. In addition, shoppers may not have adjusted easily to grab-and-go behavior, which removes social interactions with cashiers and encourages customers to spend without reviewing the bill. We’re thinking: AI has the potential to revolutionize every field, including retailing, and it’s important to find productive uses for it. Not all experiments will succeed, but patient investment and experimentation can illuminate productive paths forward.
Predicting Scientific Discoveries

A new AI method directs scientists toward promising avenues of inquiry. What's new: Jamshid Sourati and James A. Evans at the University of Chicago proposed a method to predict new scientific discoveries by building a graph that connects researchers, their objects of study, and the scientific properties thereof. They evaluated their approach using data from materials science. Key insight: Overlapping interests among researchers may indicate areas where further research would be fruitful. For example, if one group of researchers studies a material A and its property P, a second group studies materials A and B, and another group studies materials B and C, it may turn out that material C exhibits property P. How it works: The authors tried to predict whether certain inorganic materials have certain electrical properties based on scientific literature through the year 2000. From 1.5 million articles that described 100,000 inorganic compounds, they extracted the author names, materials mentioned (for example, sodium nitrite), and their properties (for example, thermoelectricity, the ability to convert heat into electricity and vice versa). They used this data to construct a graph whose nodes were authors, materials, and properties. Edges connected nodes that appeared in the same paper; for example, an author was connected to each material or property that the author’s paper covered.
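Here is a toy sketch of the graph construction, with made-up papers standing in for the literature. The authors' actual method learns node embeddings from random walks over such a graph; counting how often walks reach a property, as below, is a simplified proxy for that idea.

```python
import random
from collections import defaultdict

# Made-up example papers; each links its authors, materials, and properties.
papers = [
    {"authors": ["a1"], "materials": ["A"], "properties": ["P"]},
    {"authors": ["a2"], "materials": ["A", "B"], "properties": []},
    {"authors": ["a3"], "materials": ["B", "C"], "properties": []},
]

# Build an undirected graph: every pair of entities in a paper gets an edge.
graph = defaultdict(set)
for paper in papers:
    nodes = paper["authors"] + paper["materials"] + paper["properties"]
    for u in nodes:
        for v in nodes:
            if u != v:
                graph[u].add(v)


def walk_hit_rate(start, target, steps=10, walks=2000, seed=0):
    """Fraction of random walks from `start` that reach `target`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(walks):
        node = start
        for _ in range(steps):
            node = rng.choice(sorted(graph[node]))
            if node == target:
                hits += 1
                break
    return hits / walks


# Material C touches property P only through the chain C-B-A-P, yet random
# walks still reach P from it, hinting that C may exhibit P.
for material in ["B", "C"]:
    print(material, walk_hit_rate(material, "P"))
```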
Results: The authors predicted which materials possessed each of three properties. They compared their results with predictions obtained in a similar way using a Word2Vec model trained exclusively on text from scientific papers. They used papers from 2001 through 2018 to evaluate the predictions. For thermoelectricity, the cumulative precision (percentage of predicted discoveries proven correct) was 76 percent, while the cumulative precision of the alternative method was 48 percent. The cumulative precision of random guesses was about 3 percent. The authors obtained similar results for the other two properties. Why it matters: Science is a social endeavor, where the connections between people and their work can be represented as a graph that reflects the collective attention of the scientific community. The collective attention acts as a signal that predicts promising avenues for further research — a signal that machine learning can help to tease out. We're thinking: The authors also predicted drug discoveries with similarly good results. Their method may be useful for identifying fruitful directions in other scientific areas, and perhaps in other domains entirely.
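For clarity, cumulative precision as defined above can be computed like this (with hypothetical data):

```python
def cumulative_precision(ranked_predictions, confirmed, k):
    """Fraction of the top-k predictions later confirmed as discoveries."""
    return sum(1 for p in ranked_predictions[:k] if p in confirmed) / k


ranked = ["C", "B", "D"]   # hypothetical materials, ranked by predicted score
confirmed = {"C"}          # hypothetical discoveries verified in later papers
print(cumulative_precision(ranked, confirmed, k=2))  # 0.5
```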
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.