Dear friends,
One reason for machine learning’s success is that our field welcomes a wide range of work. I can’t think of even one example where someone developed what they called a machine learning algorithm and senior members of our community criticized it saying, “that’s not machine learning!” Indeed, linear regression using a least-squares cost function was used by mathematicians Legendre and Gauss in the early 1800s — long before the invention of computers — yet machine learning has embraced these algorithms, and we routinely call them “machine learning” in introductory courses!
In contrast, about 20 years ago, I saw statistics departments at a number of universities look at developments in machine learning and say, “that’s not really statistics.” This is one reason why machine learning grew much more in computer science than statistics departments. (Fortunately, since then, most statistics departments have become much more open to machine learning.)
This contrast came to mind a few months ago, as I thought about how to talk about agentic systems that use design patterns such as reflection, tool use, planning, and multi-agent collaboration to produce better results than zero-shot prompting. I had been involved in conversations about whether certain systems should count as “agents.” Rather than having to choose whether or not something is an agent in a binary way, I thought, it would be more useful to think of systems as being agent-like to different degrees. Unlike the noun “agent,” the adjective “agentic” allows us to contemplate such systems and include all of them in this growing movement.
More and more people are building systems that prompt a large language model multiple times using agent-like design patterns. But there’s a gray zone between what clearly is not an agent (prompting a model once) and what clearly is (say, an autonomous agent that, given high-level instructions, plans, uses tools, and carries out multiple, iterative steps of processing). Rather than arguing over which work to include or exclude as being a true agent, we can acknowledge that there are different degrees to which systems can be agentic. Then we can more easily include everyone who wants to work on agentic systems. We can also encourage newcomers to start by building simple agentic workflows and iteratively make their systems more sophisticated.
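One of the simplest agentic design patterns mentioned above is reflection: prompt a model for a draft, prompt it again to critique the draft, and revise until the critic is satisfied. Here's a minimal sketch of that control flow. The `call_llm` function is a hypothetical stand-in for any chat-completion API; it's stubbed here so the loop structure is runnable as-is.

```python
# Minimal sketch of a reflection-style agentic workflow.
# `call_llm` is a hypothetical stub standing in for a real LLM API call.

def call_llm(prompt: str) -> str:
    # Stub: a real system would call a hosted or local model here.
    if "Critique" in prompt:
        return "OK"  # the critic approves the draft
    return "Draft answer"

def reflect_and_revise(task: str, max_rounds: int = 3) -> str:
    """Generate a draft, ask the model to critique it, and revise
    until the critic approves or the round budget is exhausted."""
    draft = call_llm(f"Answer the task: {task}")
    for _ in range(max_rounds):
        critique = call_llm(f"Critique this answer to '{task}': {draft}")
        if critique.strip() == "OK":
            break  # critic found nothing to improve
        draft = call_llm(f"Revise the answer using this critique: {critique}\n\n{draft}")
    return draft
```

A system like this sits partway along the agentic spectrum: it iterates and self-corrects, but it doesn't plan or use external tools. Swapping the stub for a real model call and adding tool use or planning steps moves it further along.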
In the past few weeks, I’ve noticed that, while technical people and non-technical people alike sometimes use the word “agent,” mainly only technical people use the word “agentic” (for now!). So when I see an article that talks about “agentic” workflows, I’m more likely to read it, since it’s less likely to be marketing fluff and more likely to have been written by someone who understands the technology.
Let’s keep working on agentic systems and keep welcoming anyone who wants to join our field!
Keep learning,
Andrew
A MESSAGE FROM DEEPLEARNING.AI
Grow your generative AI skills with DeepLearning.AI’s short courses! Learn how to build highly controllable agents in “AI Agents in LangGraph.” Enroll for free and get started
News

Apple’s Gen AI Strategy Revealed
Apple presented its plan to imbue its phones and computers with artificial intelligence.

How it works: Apple outlined the architecture that underpins the new features and compared two of its models against those of competitors.
Behind the news: While rivals like Microsoft and Google dove into generative AI, Apple moved more cautiously. During the 2010s, it invested heavily in its Siri voice assistant, but the technology was outpaced by subsequent developments. Since then, the famously secretive company has been perceived as falling behind big-tech rivals in AI.
Audio Generation Clear of Copyrights
Sonically minded developers gained a high-profile text-to-audio generator.

What’s new: Stability AI released Stable Audio Open, which takes text prompts and generates 16kHz-resolution music or sound effects. The model’s code and weights are available for noncommercial use. You can listen to a few sample outputs here.
Behind the news: Stable Audio Open competes not only with Stable Audio 2.0 but also with a handful of recent models. ElevenLabs, known for voice cloning and generation, introduced Sound Effects, which generates brief sound effects from a text prompt. Users can input up to 10,000 prompt characters with a free account. For music generation, Udio and Suno offer web-based systems that take text prompts and generate structured compositions including songs with lyrics, voices, and full instrumentation. Users can generate a handful of compositions daily for free.

Why it matters: Stable Audio Open is pretrained on both music and sound effects, and it can be fine-tuned and otherwise modified. The fact that its training data was copyright-free guarantees that users won’t make use of proprietary sounds — a suitable option for those who prefer to steer clear of the music industry’s brewing intellectual property disputes.
Seoul AI Summit Spurs Safety Agreements
At meetings in Seoul, government and corporate officials from dozens of countries agreed to take action on AI safety.

International commitments: Government officials hammered out frameworks for promoting innovation while managing risk.
Corporate commitments: AI companies agreed to monitor their own work and collaborate on further measures.
Behind the news: Co-hosted by the UK and South Korean governments at the Korea Advanced Institute of Science and Technology, the meeting followed an initial summit held at Bletchley Park outside London in November. The earlier summit facilitated agreements to create AI safety institutes, test AI products before public release, and create an international panel akin to the Intergovernmental Panel on Climate Change to draft reports on the state of AI. The panel published an interim report in May. It will release its final report at the next summit in Paris in November 2024.

Why it matters: There was a chance that the Bletchley Park summit would be a one-off. The fact that a second meeting occurred is a sign that public and private interests alike want at least a seat at the table in discussions of AI safety. Much work remains to define terms and establish protocols, but plans for future summits indicate a clear appetite for further cooperation.

We’re thinking: Andrew Ng spoke at the AI Global Forum on the importance of regulating applications rather than technology and chatted with many government leaders there. Discussions focused at least as much on promoting innovation as mitigating hypothetical risks. While some large companies continued to lobby for safety measures that would unnecessarily impede dissemination of cutting-edge foundation models and hamper open-source and smaller competitors, most government leaders seemed to give little credence to science-fiction risks, such as AI takeover, and express concern about concrete, harmful applications like the use of AI to interfere with democratic elections. These are encouraging shifts!
The LLM Will See You Now
A critical step in diagnosing illnesses is a conversation between doctor and patient to assemble a medical history, discuss approaches to managing symptoms, and so on. Can a large language model play the doctor’s role? Researchers trained one to do surprisingly well.

What’s new: Articulate Medical Intelligence Explorer (AMIE), a chatbot built by Google researchers Tao Tu, Anil Palepu, Mike Schaekermann and colleagues, showed better diagnostic ability and bedside manner than doctors in conversations with patients. The conversations covered a range of complaints including cardiovascular, respiratory, gastroenterological, neurological, urological, obstetric, and gynecological conditions.

Key insight: A pretrained LLM that’s fine-tuned on conversations between doctors and patients can learn to mimic the doctor’s role. However, such models are limited because available datasets of real-world medical conversations don’t cover the full range of medical scenarios and include ambiguities, interruptions, implicit references, and the like, posing difficulties for learning. Conversations generated by a pretrained LLM can cover more conditions in more articulate language. After fine-tuning on real-world conversations, further tuning on generated conversations can improve performance. In addition, critiquing the “doctor’s” performance after a conversation can improve its ability to render diagnoses, suggest plans for managing symptoms, empathize with patients, and otherwise perform its role.

How it works: The authors fine-tuned a pretrained PaLM-2 on medical multiple-choice questions that describe symptoms, possible causes, and evidence for the correct diagnosis, as well as datasets for tasks like summarizing and continuing medical dialogs. They further fine-tuned the model on its own output.
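The self-improvement loop described above — generate a dialog, critique it, and keep the refined result as fine-tuning data — can be sketched in outline. This is a hedged sketch of the general technique, not the authors’ code; `generate_dialog` and `apply_critique` are hypothetical stubs standing in for calls to the model in its doctor/patient and critic roles.

```python
# Hedged sketch of a self-play data-generation loop: simulate a
# doctor-patient dialog per condition, refine each dialog with critic
# feedback, and collect the results for the next fine-tuning round.
# Both helpers are hypothetical stubs, not Google's actual pipeline.

def generate_dialog(condition: str) -> str:
    # Stub: a real system would have the model play both doctor and
    # patient roles in a simulated conversation about the condition.
    return f"Doctor-patient dialog about {condition}"

def apply_critique(dialog: str) -> str:
    # Stub: a real system would have a critic model score the doctor's
    # turns and a reviser incorporate the feedback.
    return dialog + " [revised per critic feedback]"

def self_play_round(conditions: list[str]) -> list[str]:
    """One round: simulate a dialog per condition, refine each with
    critic feedback, and return the refined dialogs as training data."""
    training_data = []
    for condition in conditions:
        dialog = generate_dialog(condition)
        training_data.append(apply_critique(dialog))
    # A real pipeline would now fine-tune the model on training_data
    # and repeat the round with the improved model.
    return training_data
```

The key design choice is that the critic operates on whole dialogs rather than single turns, so feedback can address conversation-level qualities like empathy and diagnostic reasoning, not just individual responses.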
Results: Specialist physicians evaluated the doctor model’s performance in 149 conversations with human actors who played the roles of patients based on scenarios supplied by clinical providers. They compared the model’s output with those of 20 primary care physicians based on their own conversations with the actors.
Why it matters: LLMs can generate fine-tuning data that improves their own performance. By training on relevant, factually correct medical information from the web, LLMs can generate realistic conversations at scale — even in a highly technical, high-stakes discipline like medicine, despite the risk of dangerous hallucinations. Used as fine-tuning data, this output enables LLMs to converse with humans more effectively.

We're thinking: AI promises to spread intelligence far and wide. As the authors acknowledge, further work remains to demonstrate this work’s efficacy, ethics, security, and regulatory compliance in a clinical setting. Yet it’s an exciting glimpse of a world in which medical intelligence is fast, cheap, and widely available.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.