The Batch top banner - February 20, 2026

 

 

Dear friends,

 

Will AI create new job opportunities? My daughter Nova loves cats, and her favorite color is yellow. For her 7th birthday, we got a cat-themed cake in yellow by first using Gemini’s Nano Banana to design it, and then asking a baker to create it using delicious sponge cake and icing. My daughter was delighted by this unique creation, and the process created additional work for the baker (which I feel privileged to have been able to afford).


Many people are worried about AI taking people's jobs. As a society, we have a moral responsibility to take care of people whose livelihoods are harmed. At the same time, I see many opportunities for people to take on new jobs and grow their areas of responsibility.


We are still early on the path of AI generating a lot of new jobs. I don't know if baking AI-designed cakes will grow into a large business. (AI Fund is not pursuing this opportunity, because if we do, I will gain a lot of weight.) But throughout history, when people have invented tools that unleashed human creativity, large amounts of new and meaningful work have resulted. For instance, according to one study, over the past 150 years, falling employment in agriculture and manufacturing has been “more than offset by rapid growth in the caring, creative, technology, and business services sectors.”


AI is also growing the demand for many digital services, which can translate into more work for people creating, maintaining, selling, and expanding upon these services. For example, I used to carry out a limited number of web searches every day; today, my coding agents carry out dramatically more. One of them, the Agentic Reviewer, which I started as a weekend project and Yixing Jiang then helped make much better, automatically reviews research articles. It uses a web search API to search for related work, and this generates vastly more web search queries a day than I have ever entered by hand.

Two images show an AI-generated cake design and its baked version, both featuring a yellow cat theme.

The evolution of AI and software continues to accelerate, and the set of opportunities for things we can build still grows every day. I’ve stopped writing code by hand. More controversially, I long ago stopped reading generated code. I realize I’m in the minority here, but I find I can get most of what I want built without looking directly at the code, operating at a higher level of abstraction and using coding agents to manipulate code for me. Will conventional programming languages like Python and TypeScript go the way of assembly — generated and used, but rarely examined directly by a human developer — or will models compile directly from English prompts to byte code?


Either way, if every developer becomes 10x more productive, I don't think we’ll end up with 1/10th as many developers, because the demand for custom software has no practical ceiling. Instead, the number of people who develop software will grow massively. In fact, I’m seeing early signs of “X Engineer” jobs, such as Recruiting Engineer or Marketing Engineer, held by people who sit within a business function X and create software for that function.


One thing I’m convinced of based on my experience with Nova’s birthday cake: AI will allow us to have a batter life!

 

Keep building,

Andrew 

 

 

A MESSAGE FROM DEEPLEARNING.AI

Promo banner for: "AI Dev 26 × San Francisco"

The first speakers for AI Dev 26 × San Francisco are confirmed! Hear from leaders shaping AI, join hands-on technical workshops, explore live demos of real-world systems, and discover emerging startups in our new AI Startup Track. View the lineup and secure your ticket today

 

News

Benchmark table shows GLM-5 outperforming other models in reasoning, coding, and general agent tasks.

GLM-5 Scales Up

 

Z.ai more than doubled the size of its flagship large language model to deliver outstanding performance among open-weights competitors.

 

What’s new: GLM-5 is designed for long-running agentic tasks. It tops other open-weights models in Artificial Analysis’ Intelligence Index.

  • Input/output: Text in (up to 200,000 tokens), text out (up to 128,000 tokens)
  • Architecture: Mixture-of-experts transformer, 744 billion parameters, 40 billion active parameters per token
  • Features: Function calling, reasoning, context caching
  • Performance: Best among open-weights models on Artificial Analysis Intelligence Index, 𝜏²-Bench Telecom, Vending Bench 2, and Chatbot Arena Code
  • Availability/price: Web interface free, weights available via Hugging Face for commercial and noncommercial uses under MIT license, API $1.00/$0.20/$3.20 per million input/cached/output tokens, coding plans $27 to $216 per quarter
  • Undisclosed: Specific architecture, training data, and method
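To get a feel for what those API rates mean in practice, here's a hypothetical cost calculation in Python. The request sizes are made up for illustration; only the per-million-token prices come from the announcement.

```python
# GLM-5 published API rates, in USD per 1 million tokens.
RATES = {"input": 1.00, "cached": 0.20, "output": 3.20}

def estimate_cost(input_tokens, cached_tokens, output_tokens):
    """Estimated USD cost of one request at the published rates."""
    return (input_tokens * RATES["input"]
            + cached_tokens * RATES["cached"]
            + output_tokens * RATES["output"]) / 1_000_000

# A hypothetical long agentic request: 150k fresh input tokens,
# 40k cached input tokens, 20k output tokens.
cost = estimate_cost(150_000, 40_000, 20_000)
print(f"${cost:.3f}")  # $0.222
```

Output tokens dominate the bill here, which is typical for reasoning models that emit long chains of thought.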

How it works: Z.ai disclosed few details about GLM-5’s architecture and training.

  • The company pretrained GLM-5 on 28.5 trillion tokens, up from the 23 trillion tokens for GLM-4.5.
  • For post-training, the company used slime, open-source software for reinforcement learning, originated by Z.ai, in which data generation and training are independent processes. The company says this infrastructure improved training throughput, enabling more iterations during reinforcement learning.
  • GLM-5 uses DeepSeek sparse attention, which reduces computation over long contexts by processing only the most relevant portions of long inputs rather than attending to every token.
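The general idea behind sparse attention can be illustrated with a toy sketch (this is illustrative only, not Z.ai's or DeepSeek's actual implementation): score every key cheaply, keep only the top-k most relevant, and run softmax attention over that subset instead of the full sequence.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=4):
    """Toy single-query sparse attention: attend only to the k most
    relevant keys rather than all of them. Illustrative sketch only."""
    scores = K @ q                         # relevance of each key to the query
    keep = np.argsort(scores)[-k:]         # indices of the k highest scores
    sub = scores[keep] / np.sqrt(len(q))   # scaled scores for the kept keys
    weights = np.exp(sub - sub.max())
    weights /= weights.sum()               # softmax over the kept keys only
    return weights @ V[keep]               # weighted sum of the kept values

rng = np.random.default_rng(0)
q = rng.normal(size=8)
K, V = rng.normal(size=(100, 8)), rng.normal(size=(100, 8))
out = topk_sparse_attention(q, K, V, k=4)
print(out.shape)  # (8,)
```

With 100 keys and k=4, the softmax and value-mixing work drops by a factor of 25, which is the kind of saving that makes very long contexts tractable.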

Performance: GLM-5 achieved the highest performance among open-weights models in some coding and agentic tasks but generally trailed proprietary frontier models.

  • On Artificial Analysis’ Intelligence Index, a weighted average of 10 evaluations that focus on economically useful work, GLM-5 with reasoning enabled (50) surpassed the previous open-weights leader, Kimi K2.5 set to reasoning (47). It trailed Claude Opus 4.6 set to adaptive reasoning (53) and GPT-5.2 set to xhigh reasoning (51).
  • GLM-5 also showed strength in agentic tasks. On 𝜏²-Bench Telecom, which tests the ability of conversational agents to collaborate with users in technical support scenarios, GLM-5 achieved 98 percent (with reasoning) and 97 percent (without reasoning), while Qwen3-Max-Thinking (98.2 percent) set the state of the art. On Vending-Bench 2, a simulated business scenario designed to measure agentic performance over long contexts, GLM-5 ($4,432.12) outperformed all open-weights models tested including Kimi K2.5 ($1,198.46).
  • In Chatbot Arena Code, where human judges compare models head-to-head, GLM-5 (1449 Elo) ranked first among open-weights models. It ranked sixth overall, trailing Claude Opus 4.6 (1567 Elo), tied with Gemini 3 Pro, and outperforming Kimi K2.5 (1447 Elo).

Why it matters: On Artificial Analysis’ Intelligence Index, GLM-5 nearly matches proprietary leaders Claude Opus 4.6 and GPT-5.2. The shrinking gaps between open-weights and proprietary models give developers high-performance options to modify and/or run on their own hardware.

 

We’re thinking: The center of gravity in open-weights AI has shifted decisively eastward. Developers in China have been responsible for a succession of leading open-weights large language models lately, including GLM-4.5, Kimi K2, Qwen3-VL-235B-A22B, and Kimi K2.5.

 

Bar chart showing tech lobbying budgets in 2025, with Meta leading at $26.29M, followed by Amazon and Alphabet.

Big AI Spends Big on Lobbying

 

Top tech and AI companies spent more than $100 million to influence government policy in 2025, the first time they exceeded that figure. 

 

What happened: Meta put $26.29 million into political lobbying last year, more than any other company in any industry, Bloomberg reported. Other big spenders include Amazon ($17.89 million), Alphabet ($13.10 million), and Microsoft ($9.36 million), and Nvidia’s relatively modest budget ballooned to $4.9 million, seven times its size in 2024. Big spenders have been rewarded as the federal government shifted toward more tech-friendly policies, notably support for building data centers and a reversal of the White House’s ban on selling advanced AI chips to China. (Disclosure: Andrew Ng serves on Amazon’s board of directors.)

 

How it works: Corporate spending on lobbying typically goes into advising officials and drafting legislative proposals, often indirectly through political action committees and industry groups. (Spending to elect favored candidates can be even higher; Meta has allocated $65 million to elect AI-friendly state officials this year, The New York Times reported.) Of the 10 tech companies that spent the most on lobbying last year, several donated to favored White House projects and political organizations. In addition, some companies hired employees who have close relationships with the Trump administration, had their executives attend White House events, and committed to spending on administration priorities.

  • Meta, Alphabet, Nvidia, AMD, and venture capital firm Andreessen Horowitz raised their lobbying budgets in 2025. Amazon, Microsoft, and Oracle spent slightly less than in 2024. Qualcomm and Intel cut their budgets substantially.
  • Alphabet, Apple, Meta, and Microsoft pledged to donate funds to rebuild the White House ballroom, a priority of President Trump’s. OpenAI President Greg Brockman and his wife together gave $25 million to the President’s political action committee.
  • Meta recruited a former Trump adviser as its new president and vice chairman and promoted a former administration official to be its general counsel. OpenAI hired another former Trump adviser to lead its global energy policy.

Tech-friendly policies: Recent changes in national AI policy mirrored the interests of companies that spent the most on lobbying.

  • No national laws explicitly regulate AI in the United States. However, Meta, OpenAI, and Andreessen Horowitz, all of which are among the top-10 tech companies in terms of lobbying, opposed state-level regulation of AI on the grounds that such laws would create a patchwork that would make compliance difficult. In December, President Trump issued an executive order that aims to limit state laws that govern AI.
  • For years, the federal government has blocked Nvidia from selling its most advanced AI chips to China, depriving the company of an estimated $50 billion in sales. Nvidia increased its lobbying budget to $4.9 million from $640,000, and CEO Jensen Huang met with President Trump a number of times over the year. The president relaxed the ban in July and lifted it in December.
  • OpenAI, which spent nearly $3 million on lobbying in 2025, up from $1.76 million in 2024, seeks White House support for its plan, known as Stargate, to build an immense network of data centers to process AI. CEO Sam Altman appeared with President Trump the day after his inauguration as the president moved to expedite siting, permitting, and funding of such data centers.
  • During the first Trump administration, the White House imposed tariffs on goods imported from China. This placed a burden on Apple, which assembles its products largely in China. In April, the Trump administration exempted Apple products from tariffs. The following August, Apple agreed to spend $600 billion over four years to build facilities for domestic manufacturing of its products.

Why it matters: Tech companies aren’t the biggest spenders on lobbying. That distinction belongs to healthcare companies. Yet the AI giants’ escalating efforts portend a streamlined regulatory environment while consolidating their power within it. The impact on developers has been largely positive. Lobbying by tech giants appears to have helped alleviate the headache of navigating a patchwork of state laws. The push to build massive infrastructure projects and relax restrictions on chip exports promises a surge in overall compute capacity and hardware stability. However, doing business may become harder for companies that don’t pay to play.

 

We’re thinking: As industries mature, sometimes they shift from technical meritocracies in which the best tech wins to political arenas in which power dynamics matter at least as much. AI developers increasingly may be channeled into policy frameworks developed by big-tech lobbyists, for better or worse.

 

Man working on multiple monitors with code and chat bubbles.

Learn More About AI With Data Points!

 

AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered Qwen 3.5’s state-of-the-art performance across 200+ languages and Claude Sonnet 4.6 reaching Opus-level performance at a significantly lower price. Subscribe today!

 

Two comparison tables show AI model performance across varied benchmarks, highlighting LFM2.5-1.2B.

Faster Reasoning at the Edge

 

Reasoning models in the 1 billion to 2 billion parameter range typically require more than 1 gigabyte of RAM to run. Liquid AI released one that runs in less than 900 megabytes and does so with exceptional speed and efficiency.

 

What’s new: Liquid AI’s LFM2.5-1.2B-Thinking is designed to run on small devices. It complements base, instruction-tuned, Japanese, vision-language, and audio-language LFM2.5 variants, which debuted in January.

  • Input/output: Text in (up to 32,768 tokens), text out
  • Architecture: Hybrid transformer-convolutional neural network, 1.17 billion parameters
  • Performance: Matched or exceeded Qwen3-1.7B on most reasoning benchmarks while running twice as fast, requiring less memory, and generating fewer output tokens
  • Features: Reasoning, tool use, eight languages (English, Arabic, Chinese, French, German, Japanese, Korean, Spanish)
  • Availability: Free web user interface, weights available for download and licensed for noncommercial and commercial uses to organizations up to $10 million annual revenue
  • Undisclosed: Training data

How it works: The architecture mixes attention layers with convolutional layers which, given a new token, process only an adjacent group of tokens — rather than the entire input sequence, as attention does — and thus use less computation and memory. Small models can develop issues such as forgetting as they’re trained on successive domains. To overcome such problems, the team trained LFM2.5-1.2B-Thinking in phases.

  • The team pretrained the model on 28 trillion tokens, up from 10 trillion for earlier variants.
  • They introduced step-by-step reasoning data during mid-training, a phase after pretraining that typically uses mid-size datasets to sharpen distinct skills prior to fine-tuning.
  • They continued with supervised fine-tuning on synthetic reasoning data.
  • During the reinforcement-learning (RL) phase, the team produced 25 versions of the model specialized for different domains such as reasoning, mathematics, and tool use and merged them into a single model. (The authors don’t describe the model-merging method.) For example, after RL training in tool use, they merged the tool-use version with a math version to restore any degraded math capacity.
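The merging method is undisclosed, but the simplest common scheme is (weighted) parameter averaging across specialist checkpoints. Here's a minimal sketch of that general technique, using hypothetical two-parameter "models"; Liquid AI's actual procedure may differ.

```python
def merge_models(state_dicts, weights=None):
    """Merge specialist checkpoints by weighted parameter averaging --
    the simplest model-merging scheme (the actual method is undisclosed)."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n  # default: uniform average
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical specialists: one tuned for tool use, one for math.
tool_sd = {"w": 1.0, "b": 0.0}
math_sd = {"w": 3.0, "b": 2.0}
print(merge_models([tool_sd, math_sd]))  # {'w': 2.0, 'b': 1.0}
```

Averaging lets one specialist's strengths (say, math) partially restore capabilities that another round of training (say, tool-use RL) degraded, without retraining from scratch.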

Results: On Artificial Analysis’ Intelligence Index, a weighted average of 10 benchmarks, LFM2.5-1.2B-Thinking matched models of similar and larger sizes, including Qwen3-1.7B in thinking mode.

  • In tests performed by Liquid AI, LFM2.5-1.2B-Thinking outperformed or matched Qwen3-1.7B in thinking mode on GPQA Diamond, IFEval, IFBench, Multi-IF, GSM8K, MATH-500, and BFCLv3. It underperformed that model on MMLU-Pro and AIME 2025.
  • On all benchmarks mentioned above, LFM2.5-1.2B-Thinking outperformed Google Gemma 3 1B IT, IBM Granite-4.0-1B, IBM Granite-4.0-H-1B (a hybrid transformer/mamba architecture), and Meta Llama 3.2 1B Instruct.
  • On Liquid AI’s tests of inference speed, LFM2.5-1.2B-Thinking led. Running on CPUs (Samsung Galaxy S25 Ultra and AMD Ryzen AI Max+ 395), it generated output tokens roughly twice as fast as Qwen3-1.7B (without thinking mode) while using around 45 percent less memory.

Yes, but: Small models struggle with hallucinations, and LFM2.5-1.2B-Thinking underperforms competing models in this regard. 

  • Artificial Analysis’ AA-Omniscience test penalizes hallucinations, scoring models on a scale from -100 to 100 (higher is better). LFM2.5-1.2B-Thinking (-83) came in behind Qwen3-1.7B in thinking mode (-78) and LFM2.5-1.2B-Instruct (-75). In contrast, Qwen3-8B in thinking mode achieved -66, and DeepSeek v3.2 in thinking mode achieved -23.
  • Consequently, Liquid AI recommends using the model for “agentic tasks, data extraction, and RAG” and against using it for “knowledge-intensive tasks and programming.”

Why it matters: LFM2.5-1.2B-Thinking is well suited to drive on-device agents that orchestrate tool calls, extract data, or query local databases. Such agents need the ability to follow instructions more than encyclopedic knowledge, since they’re likely to fetch external information. They also benefit from speed to handle lengthy chains of requests and a small memory footprint that leaves room for other applications.

 

We’re thinking: While many developers try to pack the most intelligence into their models, LFM2.5-1.2B strikes a balance between intelligence, inference speed, and memory requirements.

 

Diagram shows SleepFM's data processing flow from sleep signals to disease prediction using neural networks.

Sleep Signals Predict Illness

 

Difficulty sleeping often precedes heart disease, psychiatric disorders, and many other illnesses. Researchers used data gathered during sleep studies to detect such conditions.

 

What’s new: SleepFM is a system that classifies Alzheimer’s, Parkinson’s, prostate cancer, stroke, congestive heart failure, and many other conditions based on a person’s vital signs while asleep — as much as six years before they show symptoms. Rahul Thapa and Magnus Ruud Kjaer worked with colleagues at Stanford University, Danish Center for Sleep Medicine, Technical University of Denmark, BioSerenity, Harvard Medical School, and University of Copenhagen.

  • Input/output: Recordings of one night of sleep in, disease classifications out
  • Architecture: Convolutional neural network encoder, transformer, LSTM
  • Performance: Accurately classifies over 130 conditions, including whether a patient will experience congestive heart failure or stroke within six years.
  • Availability: Weights, training code, and inference code are available for download for commercial and noncommercial uses. Part of the dataset is available for noncommercial use.

How it works: SleepFM comprises a convolutional neural network (CNN), transformer, and LSTM. The authors trained the system in two stages: (i) to encode patterns in sleep data and (ii) to classify diseases. The training data comprised roughly 585,000 hours of sleep-study recordings that included, in addition to each patient’s age and sex, signals of activity in the brain, heart, respiratory system (airflow, snoring, and blood oxygen level), and leg muscles. The data was mostly proprietary but included public datasets.

  • The authors trained the CNN and transformer together. Given 5 minutes of recordings, the CNN learned to produce embeddings of each type of signal, while the transformer modified the embeddings to capture relationships within a signal type across time. The CNN and transformer were encouraged to produce similar embeddings of sleep recordings that were made at the same time and different embeddings of recordings that weren’t.
  • The authors added the LSTM and separately trained it, given 9 hours of sleep data as well as the subject’s age and sex, to classify more than 1,000 diseases.
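Encouraging similar embeddings for recordings made at the same time and dissimilar embeddings otherwise is the hallmark of contrastive learning. Here's a minimal InfoNCE-style sketch of that general technique; the paper's exact loss may differ, and the embeddings below are random stand-ins.

```python
import numpy as np

def contrastive_loss(A, B, temp=0.1):
    """InfoNCE-style loss: A[i] and B[i] are embeddings of the same
    moment of sleep (positives); all other pairings are negatives.
    A sketch of the general technique, not SleepFM's exact objective."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)   # unit-normalize rows
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    logits = A @ B.T / temp                            # pairwise similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))  # pull matching pairs together

rng = np.random.default_rng(1)
A = rng.normal(size=(16, 32))  # e.g., EEG-signal embeddings
B = rng.normal(size=(16, 32))  # e.g., heart-signal embeddings
print(round(float(contrastive_loss(A, B)), 3))
```

Minimizing this loss drives the encoders to agree on what is happening at each moment across signal types, which is what lets the later LSTM stage read disease-relevant patterns out of the embeddings.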

Results: The authors compared SleepFM’s performance on a proprietary test set to that of the same system without pretraining and of a vanilla neural network trained only on demographic information.

  • Across 14 general categories of disease, SleepFM achieved a higher area under the curve (AUC), which measures the tradeoff between true and false positives (here, a positive means the condition occurred within six years of the recording; higher is better). For example, classifying post-traumatic stress disorder, SleepFM achieved 0.75 AUC, while the same system without pretraining achieved 0.64 AUC.
  • Predicting Atrial Fibrillation in the public sleep dataset SHHS, SleepFM achieved 0.81 AUC, while earlier work trained solely for that purpose achieved 0.82 AUC.
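AUC has a handy interpretation: it is the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal sketch with made-up risk scores:

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve: the fraction of positive/negative pairs
    in which the positive case gets the higher score (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical risk scores for subjects who did (pos) and did not (neg)
# develop a condition within six years of the recording.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # 8 of 9 pairs ranked correctly
```

By this reading, SleepFM's 0.75 AUC for post-traumatic stress disorder means it ranks a true future case above a non-case three times out of four.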

Why it matters: AI’s ability to recognize subtle patterns has amazing potential in medicine and beyond. In this application, it could provide early warning of serious diseases, enabling people to take steps to prevent illness before it develops.

 

We’re thinking: We’re wide awake after reading this paper!

 

Work with Andrew Ng

 

Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.

 

Subscribe and view previous issues here.

 

Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.

 

DeepLearning.AI, 400 Castro St., Suite 600, Mountain View, CA 94041, United States

Unsubscribe Manage preferences