Dear friends,
We just released a Skill Builder tool to help you understand in which areas of AI you’re strong, where you can learn more, and what to do next to keep building your skills. I invite you to have a conversation with it.
When I’m learning a new skill, I find it hard to understand where I stand in the field, since I don’t yet know what I don’t know. Skill Builder addresses this for AI skills. It’s free for everyone to use, and many have reported finding the conversations informative. After the conversation, it shows everyone a summary report and recommends what to learn next. DeepLearning.AI Pro members additionally get more-detailed personalized feedback.
Keep building your skills! Andrew
A MESSAGE FROM DEEPLEARNING.AI
You don’t need to learn how to code to build an app. In Build with Andrew, Andrew Ng shows how to turn ideas into working web apps using simple instructions. Perfect for beginners and easy to share with someone who has been waiting to start. Explore the course today!
News
Gemini Takes the Lead
Google updated its flagship Gemini model, topping several benchmarks while beating competitors on performance per dollar.
What’s new: Google launched Gemini 3.1 Pro Preview at the same price as its predecessor Gemini 3 Pro Preview. Gemini 3.1 Pro Preview is the basis of recent performance gains by Gemini 3 Deep Think, a specialized reasoning mode separate from the three reasoning levels available via API.
How it works: Google disclosed few details about Gemini 3.1 Pro Preview. The model is a sparse mixture-of-experts transformer pretrained on text, code, images, audio, and video scraped from the web alongside licensed materials, Google user data, and synthetic data. It was fine-tuned via reinforcement learning on datasets that covered multi-step reasoning, problem solving, and theorem proving. Its model card points readers to the Gemini 3 Pro model card.
Performance: Gemini 3.1 Pro Preview achieved a variety of state-of-the-art metrics in tests performed by Artificial Analysis. However, it trailed on some measures of agentic behavior and user-preference rankings. Some sources of test results don’t specify a reasoning setting; API calls to Gemini 3.1 Pro Preview default to high reasoning.
Why it matters: Gemini 3.1 Pro’s gains appear to stem more from improved model quality than additional computation during inference: In completing the Artificial Analysis Intelligence Index, it consumed roughly the same number of tokens as its predecessor, yet it scored significantly higher. This suggests that refining models can still yield significant performance gains without inflating inference costs.
We’re thinking: On ARC-AGI-2, the performance of Gemini 3.1 Pro Preview — presumably set to high reasoning — is 8 percentage points shy of Gemini 3.1 Deep Think’s (77 percent versus 85 percent) but roughly 14 times less expensive ($0.96 per task versus $13.62 per task). That’s an incentive to reserve Deep Think for the very hardest problems.
Global AI Summit Shows Optimism
The fourth global AI summit marked a decisive shift from focusing on theoretical hazards to spreading AI’s benefits throughout the world.
What’s new: The AI Impact Summit showcased India’s ambition to serve as a counterweight to the United States and China. This year’s gathering of government officials, business leaders, and researchers took place in New Delhi from February 16 to February 20.
How it works: Billed as the first global AI summit to be hosted in the global south, the conference attracted hundreds of thousands of participants and representatives of more than 100 countries. The leaders of India, Brazil, France, Spain, Bolivia, Mauritius, and Sri Lanka were in attendance, as well as UN Secretary-General António Guterres, Director of the White House Office of Science and Technology Policy Michael Kratsios, and star CEOs including Alphabet’s Sundar Pichai, OpenAI’s Sam Altman, and Anthropic’s Dario Amodei. But one participant reported that “Chinese participation was almost nonexistent,” as the schedule overlapped with Chinese New Year celebrations.
Behind the news: India is in the spotlight as major AI companies have pledged to invest there and the national government ramps up its own AI spending.
Why it matters: Advancing AI is a global effort, and communication among national governments is an important part. This year’s summit focused on realistic issues like ensuring access to processing and connectivity and encouraging competition in the market — a welcome change from the unrealistic science-fiction worries that dominated the initial event in 2023. This year’s optimistic atmosphere signals that the 2024 and 2025 summits helped attendees recognize AI’s value. At the same time, critics highlighted the ongoing challenge of aligning AI’s rapid build-out with democratic values.
We’re thinking: It’s important that global leaders get together and keep talking. We’re glad to see that the AI summit remains ongoing and governments are aiming to use AI for the benefit of all.
Learn More About AI With Data Points!
AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered Gemini 3.1 Pro topping the Artificial Analysis Intelligence Index again, and Anthropic facing a U.S. government ultimatum over military access to Claude. Subscribe today!
Investors Panic Over Agentic AI
Makers of software that runs large companies saw their share prices plunge as investors worried that AI systems could undermine their businesses. This week, their stocks rebounded somewhat as Anthropic partnered with some of the same companies.
What’s new: Investors, alarmed by the prospect that AI-enabled coding systems could reproduce popular software tools, drove down the S&P Software & Services Index, which includes software giants such as Microsoft, Oracle, Salesforce, and Workday. The index lost 25 percent of its value between January 12, when Anthropic introduced Claude Cowork, an agent designed for professional work, and February 23, when it showed signs of recovering.
SaaSpocalypse now: The stock selloff affected mostly vendors of software subscriptions via the web, a business known as software as a service (SaaS). Jeffrey Favuzza, a strategist at the investment firm Jefferies Financial Group, dubbed the event the “SaaSpocalypse.”
Behind the news: With the rise of AI-assisted coding, observers have suggested that AI could disrupt traditional software either by replicating its capabilities or enabling agents to replace human users. In the latter scenario, AI would end the “lock-in” effect in which customers remain loyal to a particular service because they don’t want to adjust to a different vendor’s user interface or workflow. In December, Lee Robinson, vice president of developer education at Cursor, which makes AI-assisted coding tools, wrote that his company had completely replaced a content management system it previously paid for, Sanity, with a custom setup it built from scratch. The company now manages its web pages using git and saves tens of thousands of dollars in recurring fees. Sanity spokesman Knut Melvær wrote a public reply noting that Sanity’s product serves purposes, such as facilitating collaboration, that can’t easily be replicated using Cursor’s setup.
Why it matters: Investors may have panicked, but their attention isn’t misplaced: AI is changing the software market. Nonetheless, many SaaS companies will continue to thrive, and new opportunities will continue to emerge. Large language models can dissolve some competitive barriers, but others remain solid, as Fintool CEO Nicolas Bustamante explains in an insightful social media post. Agents can operate unfamiliar user interfaces, navigate complex business processes, access public datasets, and collapse expertise in multiple areas into one application. On the other hand, systems based on LLMs can’t necessarily replace SaaS offerings that rely on proprietary data, regulatory compliance, network effects, or embedded transactions. The message of the SaaSpocalypse is not that software is dead. It’s that small teams can build competitive products rapidly, and the products that have staying power will be built on resources that are beyond the reach of LLMs.
We’re thinking: SaaS isn’t dying; it’s becoming AI-native.
Can Local AI Stand In for the Cloud?
Projected demand for output from large language models is spurring a massive buildout of data centers. Researchers asked whether smaller models running on local devices could meaningfully lighten that load.
What’s new: Jon Saad-Falcon, Avanika Narayan, and colleagues at Stanford University and Together AI, a provider of software development and training, found that laptops are increasingly capable of substituting for cloud computing, based on a metric they call intelligence per watt.
Key insight: Cloud systems are typically more energy-efficient per user than local systems, but smaller, high-performance models increasingly enable local systems to run more efficiently. In a previous era, processing shifted from mainframes to personal computers when personal computers could perform well enough while using the same or less energy. Similarly, AI workloads can shift from data centers to personal devices if smaller models running on laptops can provide sufficient accuracy while using less energy per query. We can measure the viability of local versus cloud computing by computing intelligence per watt: the accuracy on a given task divided by the power consumed to achieve it. Assuming local and cloud systems achieve similar accuracy, the one with the higher intelligence per watt is a more efficient choice.
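The metric described above — accuracy on a task divided by the power consumed to achieve it — can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code, and the accuracy and wattage figures below are made-up placeholders, not measured results:

```python
def intelligence_per_watt(accuracy: float, avg_power_watts: float) -> float:
    """Accuracy on a task (0-1) divided by average power consumed (watts)."""
    if avg_power_watts <= 0:
        raise ValueError("power must be positive")
    return accuracy / avg_power_watts

# Hypothetical comparison: a small model on a laptop vs. a larger model on a
# data-center GPU. All numbers are invented for illustration.
local_ipw = intelligence_per_watt(accuracy=0.72, avg_power_watts=45.0)   # laptop chip
cloud_ipw = intelligence_per_watt(accuracy=0.80, avg_power_watts=700.0)  # server GPU

# Given similar accuracy, the system with higher intelligence per watt is the
# more efficient place to route a query.
prefer_local = local_ipw > cloud_ipw
```

In this invented example, the laptop's much lower power draw more than offsets its small accuracy deficit, so the metric favors local execution.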
How it works: The authors ran various open-weights large language models on hardware designed for laptops and data-center servers. To measure the trend in intelligence per watt over time, they included both recent models (from the Qwen3, GPT-OSS, Gemma3, and IBM Granite 4.0 families, late-2025 vintage) and older models (Mixtral-8x7B and Llama-3.1-8B, circa 2023-2024), recent processors (including the Apple M4 Max laptop chip and Nvidia H100 data-center chip) and older processors (like the 2018-vintage Nvidia Quadro RTX 6000). They fed the models 1 million queries from real-world conversations, science, and academic disciplines.
Results: Local systems don’t yet match cloud systems for intelligence per watt, but they are improving as researchers develop smaller models that achieve higher performance. If local systems are as accurate as cloud systems, routing queries to them could save substantial amounts of energy.
Yes, but: The authors did not assess the intelligence per watt of proprietary models like OpenAI GPT-5, likely because it’s unclear how much power they use. However, they did compare the accuracy of proprietary models. The most accurate local model (Qwen3-14B) trailed GPT-5, Gemini 2.5 Pro, and Claude Sonnet 4.5 by 11 to 13 percentage points of accuracy.
Why it matters: Researchers are rapidly improving large language models, producing higher-performing models that use the same amount of power. Tracking this improvement reveals the relative trade-off between power and performance over time. As that trade-off tilts further toward low-power devices, people gain more options, which offer the potential to spread the computational load and distribute machine intelligence more widely.
We’re thinking: Privacy often drives the conversation around local AI. The prospect of rising intelligence per watt creates an intriguing economic argument.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.