Dear friends,
Should there be a Stack Overflow for AI coding agents to share their learnings with each other?
Moltbook, a Reddit-like social network for agents, grew rapidly with many OpenClaw agents using it, and Meta acquired it earlier this week. I found the agents’ conversations, in which they speculate about all sorts of topics including their “souls,” mildly entertaining. I think there’s room for a new type of social media for agents that’s focused on being useful in practical ways.
Stack Overflow has been a great service for developers. It has been a place where we can ask questions, answer questions, and upvote/downvote answers. It turned into a great source of training data for LLMs, and many developers now ask LLMs coding questions rather than posting them on Stack Overflow. But Moltbook and Stack Overflow inspire me to think it would be useful to let coding agents contribute feedback on documentation to help other agents.
Keep building! Andrew
A MESSAGE FROM DEEPLEARNING.AI
In Agentic AI, taught by Andrew Ng, you’ll learn to design multi-step, autonomous workflows in raw Python covering four design patterns: reflection, tool use, planning, and multi-agent collaboration. Available exclusively at DeepLearning.AI. Enroll now!
News
GPT-5.4’s Higher Performance, Higher Price
OpenAI updated its flagship models, extending their ability to use tools and setting the state of the art on a handful of benchmarks, and priced them at the top of the market. The new models’ coding and agentic abilities have enabled Codex, OpenAI’s competitor to Anthropic’s Claude Code, to leap ahead.
What’s new: GPT-5.4 comes in two variants, Thinking and Pro, both with an expanded context window relative to GPT-5.2. (Only two days elapsed between the launches of GPT-5.3 and GPT-5.4, and OpenAI offered no explanation.) GPT-5.4 models are trained to use computers natively and to help agents find and use tools more efficiently, a capability OpenAI calls tool search.
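OpenAI hasn’t published how tool search works under the hood, but the pattern the name suggests is familiar: rather than loading every tool definition into the model’s context, the agent retrieves only the few tools most relevant to the current request. Here’s a minimal Python sketch of that retrieval step; the registry, tool names, and scoring function are all hypothetical, and a production system would likely use embedding similarity rather than word overlap.

```python
# Illustrative only: OpenAI hasn't published how tool search works.
# The idea sketched here: keep a large registry of tools, but place
# only the top-k most relevant tool schemas in the model's context.
from dataclasses import dataclass

@dataclass
class ToolSpec:  # hypothetical structure, not an OpenAI API type
    name: str
    description: str

REGISTRY = [
    ToolSpec("get_weather", "look up current weather for a city"),
    ToolSpec("send_email", "send an email to a recipient"),
    ToolSpec("query_sql", "run a read-only SQL query against a database"),
]

def score(query: str, tool: ToolSpec) -> float:
    # Toy relevance score: word overlap between the request and the
    # tool description. A real system would use embedding similarity.
    q, d = set(query.lower().split()), set(tool.description.lower().split())
    return len(q & d) / max(len(q), 1)

def search_tools(query: str, k: int = 2) -> list[ToolSpec]:
    return sorted(REGISTRY, key=lambda t: score(query, t), reverse=True)[:k]

# Only the winning schemas go into the prompt, so context stays short
# no matter how large the registry grows.
print([t.name for t in search_tools("what is the weather in Paris")])
```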
How it works: As is typical of closed models, OpenAI disclosed few details about how it built GPT-5.4 and GPT-5.4 Pro. The models are sparse mixture-of-experts transformers pretrained on text, code, and images scraped from the web alongside licensed materials, user data, and synthetic data. They were fine-tuned via reinforcement learning on datasets that covered multi-step reasoning, problem solving, and theorem proving.
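OpenAI disclosed little about the architecture beyond the mixture-of-experts label, so here is a generic PyTorch sketch of what a sparse mixture-of-experts layer does: a router scores several expert feed-forward networks for each token and runs only the top-k of them, which keeps compute per token low even as total parameters grow. The sizes and top-2 routing below are illustrative choices, not GPT-5.4’s actual design.

```python
# Minimal sparse mixture-of-experts layer, for illustration only.
# OpenAI hasn't published GPT-5.4's architecture; the sizes and top-k
# routing here are generic choices, not the model's actual design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate = self.router(x)                         # (tokens, n_experts)
        weights, idx = gate.topk(self.top_k, dim=-1)  # route each token to its top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e              # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out  # only top_k of n_experts run per token, so compute stays sparse

x = torch.randn(10, 64)
print(SparseMoE()(x).shape)  # torch.Size([10, 64])
```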
Performance: GPT-5.4 Pro leaped over GPT-5.2 Pro and Claude Opus 4.6 to achieve state-of-the-art scores on a number of metrics in independent tests by Artificial Analysis. But even in OpenAI’s own tests, it underperformed Gemini 3.1 Pro Preview on several tasks and cost more to run the same tests.
Why it matters: OpenAI’s GPT-5.4 has vaulted over Anthropic’s Claude, for the moment, to challenge Google’s Gemini for the top spot. OpenAI touts the GPT-5.4 family’s improved performance per token, but the family still requires twice as many tokens as Gemini 3.1 Pro Preview to match the latter’s performance, and its higher efficiency is largely negated by its higher price. GPT-5.4 Pro is a state-of-the-art coding model, and it costs less than Claude Opus 4.6 to complete coding tasks. But Google’s ability to keep Gemini 3.1 Pro Preview’s price low and overall intelligence high, along with its ability to process audio and video, remains a formidable obstacle for any AI company that aims to become the undisputed leader.
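To see why per-token efficiency and price can cancel out, here’s a back-of-envelope calculation with made-up numbers (not actual list prices for either model): cost per task is tokens consumed times price per token, so a model that needs twice the tokens must charge half as much per token just to break even.

```python
# Hypothetical prices, for intuition only: cost per task multiplies
# tokens needed by price per token, so token efficiency and per-token
# price can offset each other.
def cost_per_task(tokens: int, usd_per_million_tokens: float) -> float:
    return tokens * usd_per_million_tokens / 1_000_000

model_a = cost_per_task(1_000_000, 2.00)  # baseline token count
model_b = cost_per_task(2_000_000, 2.00)  # 2x tokens at the same per-token price
print(model_a, model_b)  # 2.0 vs 4.0: double the tokens, double the task cost
```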
We’re thinking: GPT-5.4 ranks highest on benchmarks developed internally by OpenAI, as you would expect. Even so, these metrics show that the models are built to solve hard problems in automating office work. GPT-5.4 Pro turned in impressive performance on GDPval (83 percent win or tie rate against professionals across knowledge-work tasks like writing legal briefs and customer-support conversations) and OSWorld-Verified (75 percent success rate on computer-use tasks like navigating websites and updating spreadsheets from files, above the 72.4 percent human baseline). Given the high human costs of this work, GPT-5.4 Pro, even at its xhigh reasoning setting, may turn out to be a bargain.
AI on Mobile Skyrockets
Downloads of mobile AI apps and resulting revenue are surging.
What’s new: The State of Mobile 2026 report by Sensor Tower, a market research firm, tracks the rapid growth of AI assistants, generative apps, and AI companions on smartphones. Last year, thanks to spending on AI apps, revenue from non-game apps exceeded gaming revenue for the first time, according to the firm’s analysis.
How it works: The authors evaluated the market for mobile AI in 2025. They estimated numbers of downloads, hours of use, and in-app revenue (but not advertising revenue) from the iOS App Store and Google Play based on proprietary data and data from developers. They did not obtain data from other app stores, so the report doesn’t reflect mobile activity in regions such as China, where users download apps mostly from stores run by domestic companies.
Behind the news: Mobile AI assistants are only a few years old, and user behavior is changing quickly. OpenAI introduced its first ChatGPT mobile app in May 2023. Today nearly every major AI assistant is available as an app. Earlier this year, Microsoft found that Copilot users behaved differently on mobile devices and at different times of day. For instance, mobile users were more likely to discuss health and fitness than work and productivity.
Why it matters: AI is becoming habitual for millions of users, not just when they’re working but also when they’re away from their desks and mobile devices are handier than desktops. In that context, AI apps increasingly compete for time and attention directly with games, social media, and short-form videos. Time and attention spent translate into more revenue and longer-term use.
We’re thinking: The question of whether AI-driven revenue will catch up to enormous capital expenditures has led to worries about an AI bubble. This blistering pace of growth in mobile AI revenue is encouraging!
Learn More About AI With Data Points!
AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered OpenAI’s new GPT-5.4 models that push agent capabilities forward and DeerFlow 2.0, an open-source sandboxed agent framework from ByteDance. Subscribe today!
AI Data Centers Go Off the Grid
Meta and OpenAI are among the tech companies that are building private power plants that will operate independently of regional grids to supply electricity for their massive buildout of AI data centers.
What’s new: Several off-grid power plants associated with data centers are planned or under construction in the United States, according to regulatory filings, permits, transcripts of conference calls with investors, and other documents, The Washington Post reported. Fueled primarily by natural gas, the plants will be connected directly to data centers, sidestepping the oversight and delays that come with grid connections. The Post’s report is based on a study by energy researcher Cleanview that identified 46 projects, 90 percent of them announced in 2025, to build private power plants that are “off the meter,” meaning they supply electricity directly to a customer rather than through the grid. Together they account for 30 percent of all planned data-center capacity in the U.S. Spurred by the White House, executives at Alphabet, Meta, Microsoft, OpenAI, Oracle, and xAI agreed to shoulder the costs of building power plants and upgrading the grid to soothe worries about rising electricity prices.
How it works: The names of the tech companies that are building their own power plants are largely obscured in the available documents. The projects, which will drive data centers that are expected to consume gigawatts of power, involve collaborations among tech companies, energy infrastructure builders, and/or local power companies. They’re being developed rapidly, in some cases using atypical generators, as conventional turbines are in short supply.
Behind the news: All the major tech companies have been scrambling to lock down access to electricity sufficient to support a data-center buildout that is projected to cost $5.2 trillion and consume 156 gigawatts by 2030.
Why it matters: The rise of private power for data centers reflects a bottleneck: AI companies plan to increase their capacity faster than existing energy suppliers can meet the demand. It signals a shift in the ways AI infrastructure, power plants, and public utilities interact, with implications that go well beyond tech companies.
We’re thinking: Until recently, the cloud-computing leaders have leaned heavily on renewable energy sources such as wind and solar power, sometimes with an assist from batteries. Their move toward gas-fired power plants is an unfortunate reversal of the effort to remain carbon-neutral.
Lightning-Fast Diffusion Learning
Research shows that diffusion image generators can train somewhat faster if they learn to reconstruct embeddings from a pretrained encoder built for vision tasks like classification, segmentation, and retrieval rather than for image generation. Recent work shows they can train dramatically faster if the diffusion model learns to reconstruct a smaller version of these embeddings, as in the sketch below.
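To make the training idea concrete, here’s a toy training step under our own assumptions (the module names and the simple linear noising scheme are illustrative, not the paper’s code): the diffusion model’s regression target is a shrunken copy of a frozen vision encoder’s embedding, rather than pixels or a standard VAE latent.

```python
# Toy version of the training idea: the denoiser learns to reconstruct
# a *shrunken* vision-encoder embedding. All modules are stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

full_dim, small_dim = 768, 32
encoder = nn.Linear(3 * 64 * 64, full_dim).eval()  # stand-in for a frozen pretrained vision encoder
shrink = nn.Linear(full_dim, small_dim)            # projects embeddings into the smaller target space
denoiser = nn.Linear(small_dim, small_dim)         # stand-in for the diffusion model

images = torch.randn(8, 3 * 64 * 64)               # a toy batch of flattened images
with torch.no_grad():
    target = shrink(encoder(images))               # shrunken embedding is the training target

t = torch.rand(8, 1)                               # random noise level per example
noisy = (1 - t) * target + t * torch.randn_like(target)  # simple linear noising, for illustration
loss = F.mse_loss(denoiser(noisy), target)         # learn to reconstruct the target embedding
loss.backward()
print(loss.item())
```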
How it works: At inference, FAE, the authors’ system, works like this: Given noise and an ImageNet class embedding or a text embedding, SiT (a latent diffusion model) generates a shrunken image embedding. Given that embedding, an embedding decoder produces a full-size embedding, from which an image decoder produces an image. The authors trained two separate FAE systems: one that generates images from class labels (trained on ImageNet at 256x256-pixel resolution) and one that generates images from text descriptions (trained on CC12M).
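Here’s a sketch of that three-stage pipeline. The modules are stand-ins: the real SiT runs many iterative denoising steps, the real decoders are far larger, and we shrink the output resolution to keep the toy light.

```python
# Sketch of the FAE inference pipeline described above. Every module
# here is a placeholder; the paper's actual interfaces may differ.
import torch
import torch.nn as nn

small_dim, full_dim = 32, 768

class DiffusionPrior(nn.Module):  # stand-in for SiT, the latent diffusion model
    def forward(self, noise, cond):
        return noise + cond       # placeholder: the real model denoises iteratively

sit = DiffusionPrior()
embedding_decoder = nn.Linear(small_dim, full_dim)  # shrunken -> full-size embedding
image_decoder = nn.Linear(full_dim, 3 * 64 * 64)    # embedding -> pixels (64x64 toy; the paper uses 256x256)

noise = torch.randn(1, small_dim)
condition = torch.randn(1, small_dim)               # stands in for a class or text embedding

small = sit(noise, condition)                       # 1) generate a shrunken image embedding
full = embedding_decoder(small)                     # 2) expand it to a full-size embedding
image = image_decoder(full).view(1, 3, 64, 64)      # 3) decode the embedding into an image
print(image.shape)  # torch.Size([1, 3, 64, 64])
```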
Results: FAE performed comparably to state-of-the-art diffusion models while training faster.
Why it matters: Encoders trained for vision tasks have different knowledge than encoders trained for image generation. That knowledge can help generate better images, but it resides in bigger embeddings that require more processing power to handle. Shrinking them makes this knowledge available to image generators in a more practical way, making it possible to train high-quality image generators in much less time.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.