Dear friends,
AI-native software engineering teams operate very differently from traditional teams. The obvious difference is that AI-native teams use coding agents to build products much faster, but this leads to many other changes in how we operate. For example, some great engineers now play broader roles than just writing code. They are partly product managers, designers, and sometimes marketers. Further, small teams that work in the same office, where they can communicate face-to-face, can move incredibly quickly.
When smaller, AI-enabled teams can get more done, generalists excel. Traditional companies need to pull together people from many specialties — engineering, product management, design, marketing, legal, etc. — to execute projects and create value. This has resulted in large teams of specialists who work together. But if a team of two people is to get work done that requires five different specialties, then some of those individuals must play roles outside a single specialty. In some small teams, individuals do have deep specializations. For example, one might be a great engineer and another a great PM. But they also understand the other key functions needed to move a project forward, and can jump into thinking through other kinds of problems as needed. Of course, proficiency with AI tools is a big help, since it helps us think through problems that involve different roles.
Even in a two-person team, communication bottlenecks must be minimized to move fast. This is why I value teams that work in the same location. Remote teams can perform well too, but the highest speed is achieved by having everyone in the room, able to communicate instantaneously to solve problems.
Keep building,
Andrew
A MESSAGE FROM DEEPLEARNING.AI
In “Spec-Driven Development,” you will learn a disciplined workflow for working with coding agents. Write specs, guide implementation step by step, and stay in control of what you build! Join in for free
News
Life After Llama
Meta pivoted from its open-weights strategy to deliver a closed alternative.
What’s new: Meta introduced its first AI model in a year and the first product of its nine-month-old Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool use and multi-agent orchestration. It leads in some health and multimodal benchmarks but falls short in coding and agentic work, which Meta frames as validating an architectural redesign on which the company plans to build larger models.
How it works: Meta disclosed limited technical details about Muse Spark but highlighted gains in training efficiency and multi-agent orchestration plus a domain-specific investment in health.
Results: Muse Spark’s benchmark performance is generally competitive and notably token-efficient. Meta acknowledged that it shows gaps in coding and agentic performance.
Behind the news: Muse Spark is Meta’s first new model since it reorganized its AI labs after critics alleged that the training data for Llama 4 had been contaminated with benchmark answers. In June 2025, Meta spent $14.3 billion for a 49 percent stake in Scale AI, brought in cofounder Alexandr Wang as chief AI officer, and launched a hiring spree with pay packages worth hundreds of millions of dollars. The proprietary release has raised concerns among developers, many of whom have built projects on open-weights Llama models.
Why it matters: Meta is investing in the capabilities that matter most for its product ambitions: multimodal perception for billions of camera-equipped users, health reasoning for one of the most common categories of AI queries, and multi-agent coordination for multi-step tasks. With a private API preview in progress, it’s positioning itself to compete for business customers alongside OpenAI, Google, and Anthropic. However, its pivot away from being the leading U.S. champion of open weights is a significant loss for the developer community.
We’re thinking: Muse Spark’s contemplating mode and Kimi K2.5’s Agent Swarm point to an emerging pattern: More labs are scaling performance by training models to orchestrate multiple agents at inference time rather than training ever-larger single models.
Big Pharma Bets Big on AI
Generative AI has proven that it can produce text, images, audio, video, and code. The world’s most valuable pharmaceutical company is betting billions that it can produce drugs as well.
What’s new: Pharma giant Eli Lilly agreed to give as much as $2.75 billion to Insilico Medicine, a Hong Kong-based biotechnology company that applies generative AI across its drug-discovery pipeline. Initially, Lilly will pay $115 million for exclusive rights to develop and sell undisclosed drugs that have not yet been tested in humans, while further payments will be tied to developmental, regulatory, and commercial milestones, Fierce Biotech reported. This is the third agreement between the companies following an AI software license in 2023 and a $100 million research collaboration in November 2025.
AI drug discovery: Founded in 2014, Insilico has used AI to develop 28 candidate drugs, roughly half of which are in clinical trials. The most advanced one, Rentosertib, targets idiopathic pulmonary fibrosis (IPF), a disease in which scarring progressively reduces lung function. A Phase 2a trial (an early, small-scale test of efficacy) showed positive results. A second drug, Garutadustat, which is intended to treat inflammatory bowel disease, entered Phase 2a in January 2026.
How it works: After choosing a disease, Insilico applies proprietary generative models to two stages of drug discovery: identifying which protein to target and designing a molecule to act on that protein.
Behind the news: Developing a new drug typically takes 10 to 15 years and costs more than $2 billion, and roughly 86 percent of candidates fail to reach approval. A growing number of drug developers apply AI to accelerate the process. A peer-reviewed analysis catalogued 173 AI-enabled drug programs across clinical stages as of mid-2025. Nonetheless, no AI-discovered drug has received regulatory approval. Of the drug candidates that reach Phase 2, 70 percent fail to reach the next phase, including AI-designed drugs from BenevolentAI and Recursion Pharmaceuticals.
Why it matters: Insilico’s pipeline suggests generative AI can tackle one of the hardest problems in science: finding a molecule that binds to a particular protein, is absorbed by the body, isn’t toxic, and helps patients. In Rentosertib’s Phase 2a trial, participants who took the highest dose gained an average of 98.4 milliliters in forced vital capacity (a measure of lung function), while those who took a placebo declined by 20.3 milliliters. That is early but concrete evidence that AI-generated drugs can help patients.
We’re thinking: AI is accelerating drug development, but it remains to be seen whether those accelerated compounds will pass clinical trials at a higher rate than those developed in traditional ways.
Learn More About AI With Data Points!
AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered how Nvidia is using AI to tackle key challenges in quantum computing, making systems faster, more accurate, and closer to real-world use. Subscribe today!
US States Move Forward With AI Laws
U.S. states are continuing to enact laws that regulate AI, despite President Trump’s efforts to discourage state-by-state legislation in favor of national laws.
What’s new: Many states have moved to regulate AI this year, contributing to a growing patchwork of legislation that stands to complicate developers’ efforts to meet legal requirements. Collectively, the states are considering numerous bills — more than 1,500, by one tally — in addition to more than 100 existing laws enacted by 40 states that are designed to discourage use of chatbots by young people, require permission to train AI systems on copyrighted material, or require security testing of AI systems, The New York Times reported.
How it works: California Governor Gavin Newsom has been the most visible opponent of the Trump Administration’s effort to discourage state-by-state regulation of AI, but more than 40 states are in the process of passing their own laws.
Behind the news: The Trump Administration started promoting national regulations over state laws as worries grew that a state-by-state patchwork could impede U.S. leadership in AI. In December, President Trump signed an executive order designed to discourage state-level legislation. The order targets laws that would stifle innovation as well as anti-bias regulations that could be perceived to have a political slant. It threatens to withhold federal funds from states that pass or enforce “onerous” AI laws and urges Congress to block state regulations. In March, it followed up with guidelines for federal legislation. The guidelines support protections for children and controls on electricity price hikes driven by AI data centers’ increasing consumption of energy.
Why it matters: An increasingly complex regulatory landscape around AI creates a potential minefield for compliance in the U.S. and contributes to unfocused, contradictory regulation worldwide. A given AI model may be required to pass a bias audit in Colorado, provide watermarking in California, and meet reporting thresholds in New York — all while the federal government moves to preempt these requirements. This jurisdictional tug-of-war increases the cost of building AI systems, adds to the legal risk of deploying new applications and services, and raises the possibility that federal funding may be withheld from states that enforce mandates the federal government considers onerous.
We’re thinking: Some current state-level mandates are sensible. Users should be able to rely on AI companies to preserve their privacy, for instance, and children should be protected from AI slop generated by and for adults. But such requirements should be imposed at the national level. We call on Congress to build a more cohesive, stable regulatory environment.
Simulating Diverse Human Cohorts
If you want to understand how the public will respond to your offerings, large language models can simulate users who answer questions about capabilities, features, promotions, or prices. However, LLMs don't respond with the range of variations that humans do. Researchers developed a method that prompts LLMs to take on personas with a customizable variety of attitudes.
What’s new: Davide Paglieri, Logan Cross, and colleagues at Google proposed Persona Generators. Their approach produces code that prompts an LLM to compose prompts for 25 personas that span a wide range of attitudes.
Key insight: Making an LLM take on a human persona typically is a matter of composing an effective prompt (for instance, “Answer the following question as if in politics today, you considered yourself a Democrat. . . .”). However, this approach tends to elicit average responses that don’t reflect the range that a human population would provide — even if the prompt explicitly directs the LLM to adopt specific demographic characteristics. An alternative is to direct a model to modify persona prompts programmatically until they produce output that covers a specific range of opinions, attitudes, or concerns. Given guidelines that define the scope of the persona population (specifically attitudes ranked by degrees of agreement to disagreement), an evolutionary algorithm can push the model to produce a set of prompts that elicit the full range of responses.
How it works: The authors used the evolutionary method AlphaEvolve to generate code that (i) produces prompts for 25 personas and (ii) maximizes the diversity of their attitudes, measured by their answers to a set of generated questionnaires.
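For readers who want a concrete picture of the loop, here is a minimal sketch in Python. It is not the authors’ code: the trait list, prompt template, questionnaire, five-point answer scale, and the mutate and llm_answer placeholders are all hypothetical stand-ins. In the paper, AlphaEvolve rewrites the generator program itself, and an LLM produces the personas’ questionnaire answers.

```python
import random

NUM_PERSONAS = 25        # the paper's generators emit 25 persona prompts
SCALE = [1, 2, 3, 4, 5]  # hypothetical 5-point disagree-to-agree scale

def llm_answer(persona: str, question: str) -> int:
    """Stand-in for an LLM call: the persona answers one questionnaire item."""
    rng = random.Random(hash((persona, question)))  # repeatable within a run
    return rng.choice(SCALE)

def diversity(personas: list[str], questionnaire: list[str]) -> float:
    """Hypothetical fitness: fraction of possible (question, answer) pairs covered."""
    covered = {(q, llm_answer(p, q)) for p in personas for q in questionnaire}
    return len(covered) / (len(questionnaire) * len(SCALE))

def make_personas(template: str) -> list[str]:
    """A persona 'generator': fills a prompt template with varied traits."""
    traits = ["skeptical", "enthusiastic", "budget-conscious", "privacy-focused", "indifferent"]
    return [template.format(i=i, trait=traits[i % len(traits)]) for i in range(NUM_PERSONAS)]

def mutate(template: str) -> str:
    """Toy mutation; in the paper, an LLM rewrites the generator code itself."""
    nudges = [" Hold an uncommon opinion.", " Be emphatic.", " Be ambivalent."]
    return template + random.choice(nudges)

def evolve(seed: str, questionnaire: list[str], generations: int = 20) -> str:
    """Toy evolutionary loop standing in for AlphaEvolve: keep the most diverse generator."""
    best, best_fit = seed, diversity(make_personas(seed), questionnaire)
    for _ in range(generations):
        candidate = mutate(best)
        fit = diversity(make_personas(candidate), questionnaire)
        if fit > best_fit:
            best, best_fit = candidate, fit
    return best

if __name__ == "__main__":
    questions = ["I would pay more for this feature.", "I trust AI recommendations."]
    seed = "You are persona {i}, a {trait} consumer. Answer survey questions in character."
    print(evolve(seed, questions))
```

The real system scores fitness from actual LLM responses and searches over code rather than a single template, but the shape of the loop is the same: generate personas, score the spread of their answers, and keep the best generator.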
Results: Given a fresh context and diversity axes, the resulting personas consistently exceeded the diversity metrics of Nemotron Personas, a large dataset of persona prompts based on U.S. demographic statistics, and of persona prompts produced by a Concordia memory generator based on generated memories from childhood to adulthood. Given a set of test questionnaires, the authors’ personas covered 82 percent of possible responses, while Nemotron Personas covered 76 percent and the Concordia memory generator covered 46 percent.
Why it matters: Organizations that aim to expand their audiences can benefit from synthetic personas that broadly reflect public sentiment, and those that create synthetic personas to match their real-world audiences can gain insights from a more diverse crowd. This work shifts the objective from matching training data (which tends to generate the most probable outputs and not the outliers) to covering all desired possibilities. Optimizing the persona generator, rather than individual personas, unlocks a broader representation of likely user behavior.
We’re thinking: Synthetic personas offer an intriguing possibility for navigating the product-management bottleneck: the difficulty of deciding what to build when building itself is as easy as prompting an LLM.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.