Dear friends,
Last week, I spoke about AI and regulation at an event at the U.S. Capitol attended by legislative and business leaders. I’m encouraged by the progress the open source community has made fending off regulations that would have stifled innovation. But opponents of open source continue to shift their arguments, with the latest worries centering on open source’s impact on national security. I hope we’ll all keep protecting open source!
The latest argument for blocking open source AI has shifted to national security. AI is useful for both economic competition and warfare, and open source opponents say the U.S. should make sure its adversaries don’t have access to the latest foundation models. While I don’t want authoritarian governments to use AI, particularly to wage unjust wars, the LLM cat is out of the bag, and authoritarian countries will fill the vacuum if democratic nations limit access. When, some day, a child somewhere asks an AI system questions about democracy, the role of a free press, or the function of an independent judiciary in preserving the rule of law, I would like the AI to reflect democratic values rather than favor authoritarian leaders’ goals over, say, human rights.
Looking beyond the U.S. federal government, there are many jurisdictions globally. Unfortunately, arguments in favor of regulations that would stifle AI development continue to proliferate. But I’ve learned from my trips to Washington and other nations’ capitals that talking to regulators does have an impact. If you get a chance to talk to a regulator at any level, I hope you’ll do what you can to help governments better understand AI.
News

Coding Assistance Start to Finish

GitHub Copilot’s latest features are designed to help manage software development from plan to pull request.

What’s new: GitHub unveiled a preview of Copilot Workspace, a generative development environment designed to encompass entire projects. Users can sign up for a waitlist to gain access to Workspace until the preview ends. Afterward, Copilot Workspace will be available to subscribers to GitHub Copilot (which starts at $10 per month for individuals and $19 per month for businesses).

How it works: Copilot Workspace is based on GPT-4 Turbo and integrated with GitHub code repositories and libraries. Where GitHub Copilot previously generated code snippets and provided suggestions for editing code segments, Copilot Workspace integrates these tasks within a larger plan.
Yes, but: Initial users noted that Copilot Workspace is best at solving straightforward, well-defined problems and struggles with more complex ones. Choices can be difficult to unwind later on, and the system is slower than simpler AI coding assistants.

Behind the news: Generative coding assistants have quickly become central tools for software development. Copilot had attracted 1.3 million paid subscribers as of April 2024, including 50,000 businesses. Amazon’s Q Developer (formerly CodeWhisperer), Google’s Gemini Code Assist (formerly Duet AI), and Cursor offer coding companions that integrate with or fork popular integrated development environments like Microsoft’s VSCode. On the frontier are agentic tools that plan and carry out complex, multi-step coding tasks.

We’re thinking: There are many ways to use AI in coding. To learn about a few more, check out our short course, “Pair Programming With a Large Language Model,” taught by Google AI advocate Laurence Moroney.
OpenAI Licenses News Archives

OpenAI has been making deals with publishers to gain access to high-quality training data. It added Financial Times to the list.

What’s new: OpenAI licensed the archive of business news owned by Financial Times (FT) for an undisclosed sum. The agreement lets OpenAI train its models on the publisher’s articles and deliver information gleaned from them. This is OpenAI’s fifth such agreement with major news publishers in the past year.

How it works: Although the parties didn’t disclose the length of their agreement, OpenAI’s other news licensing deals will end within a few years. The limited commitment suggests that these arrangements are experimental rather than strategic. The deal includes articles behind the publisher’s paywall; that is, articles not freely available on the open internet. This enables OpenAI to train its models on material that competitors may not have. Other deals have given OpenAI exclusive access, shutting competitors out.
Behind the news: Archives of news articles may be handy if OpenAI proceeds with a rumored search service, reported in February by The Information. Licensing is an unambiguously legal way to obtain such material. Although AI researchers commonly scrape data from the web and use it to train models without obtaining licenses for copyrighted works, whether U.S. law requires a license to train AI models on copyrighted works has yet to be determined. Copyright owners lately have challenged this practice in court. In December 2023, The New York Times sued OpenAI and Microsoft, claiming that OpenAI infringed its copyrights by training models on its articles. In April 2024, eight U.S. newspapers owned by Alden Global Capital, a hedge fund, filed a lawsuit against the same defendants on similar grounds. Licensing material from publishers gives OpenAI access to their works while offering them incentives to negotiate rather than sue.

We’re thinking: We support efforts to clarify the legal status of training AI models on data scraped from the web. It makes sense to treat open web pages and paywalled content differently, but we advocate that AI models be free to learn from the open internet just as humans can.
NEW FROM DEEPLEARNING.AI

Explore the newest additions to our short courses with “Quantization in Depth,” where you’ll build a quantizer in PyTorch, and “Building Agentic RAG with LlamaIndex,” which teaches how to build agents capable of tool use, reasoning, and making decisions based on your data. Sign up now!
Landmine Recognition

An AI system is scouring battlefields for landmines and other unexploded ordnance, enabling specialists to defuse them.

What’s new: The military hardware firm Safe Pro Group developed Spotlight AI, a computer vision system that identifies mines in aerial imagery, IEEE Spectrum reported. Nongovernmental organizations that remove landmines, including Norwegian People’s Aid and the HALO Trust, are using the system in Ukraine.

How it works: Spotlight AI processes visual-light imagery taken by flying drones. The system provides centimeter-resolution maps that guide mine-removal teams through the territory.
Behind the news: In addition to drones, satellites can help machine learning models find deadly remnants of warfare. In 2020, Ohio State University researchers estimated the number of undetonated explosives in Cambodia by collating bomb craters identified in satellite images by a computer vision model with records of U.S. military bombing campaigns in that country in the 1960s and 1970s.

Why it matters: Unexploded mines, bombs, and other munitions killed or injured more than 4,700 people in 2022 alone; where military status and age were known, 85 percent of the victims were civilians and half were children. Efforts to remove every last mine from a former battlefield likely will continue to rely on traditional methods (manual analysis of overhead imagery along with sweeps by human specialists and explosive-sniffing dogs), but machine learning can significantly reduce the hazard and accelerate the work.

We’re thinking: Although this system locates unexploded mines and shells, removing them often still falls to a brave human. We hope for speedy progress in robots that can take on this work as well.
Streamlined Inference

It’s not necessary to activate all parts of a large language model to process a given input. Using only the necessary parts saves processing.

What’s new: Zichang Liu and collaborators at Rice University, Zhejiang University, Stanford, University of California San Diego, ETH Zürich, Adobe, Meta, and Carnegie Mellon proposed Deja Vu, an algorithm that accelerates inference in large language models (LLMs) by using small vanilla neural networks to predict which parts of the model to use.

Key insight: Transformer-based neural networks can save a lot of time at inference by activating only a fraction of (i) attention heads and (ii) neurons in fully connected layers. But it’s necessary to activate the right neurons, because different parts of the network learn about different patterns of inputs. By using the input to decide which parts of the network to activate, the network can maintain accuracy using only the parts relevant to the current input.

How it works: The authors used pretrained OPT models of various sizes (175 billion, 66 billion, and 30 billion parameters). They built a dataset by feeding examples from OpenBookQA and WikiText to the OPTs and recording the outputs of all attention heads and fully-connected-layer neurons. By activating various portions of these networks, they learned that, for a given input, they could discard most of an OPT’s lowest-output attention heads and fully-connected-layer neurons without degrading its performance.
Results: Deja Vu (175 billion parameters) produced a sequence of 128 tokens in 20 milliseconds, while an Nvidia implementation of OPT of the same size needed 40 milliseconds and a Hugging Face implementation of the same size needed 105 milliseconds. Moreover, Deja Vu achieved these speedups without reducing accuracy. On WikiText and C4, Deja Vu’s ability to predict the next word held steady while activating 25 percent of attention heads and fully-connected-layer neurons. On datasets such as WinoGrande and OpenBookQA, it maintained its accuracy while activating 35 percent of attention heads and fully-connected-layer neurons.

Why it matters: Efficient use of processing power becomes increasingly important as models become larger. Moreover, faster token generation benefits agentic workflows, which can consume large numbers of tokens.

We’re thinking: Deja Vu’s design is in the spirit of the mixture-of-experts (MoE) architecture: for each transformer layer, MoE uses a neural-network layer to choose which fully connected layer to use. In contrast, Deja Vu uses small neural networks to decide which individual attention heads and fully-connected-layer neurons to activate.
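The selection mechanism can be illustrated with a toy sketch. The NumPy snippet below is not the paper’s implementation: the weights and the predictor here are random placeholders (in Deja Vu, the predictors are small networks trained on recorded activations). It shows only the mechanics: a linear probe scores each neuron of a fully connected block for a given input, and only the top-k neurons are computed.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff, k = 8, 32, 8  # keep 25% of neurons, echoing Deja Vu's sparsity level

# Placeholder weights for one ReLU feed-forward block.
W1 = rng.normal(size=(d_model, d_ff))
W2 = rng.normal(size=(d_ff, d_model))

# Placeholder predictor: scores each neuron's likely activation for an input.
P = rng.normal(size=(d_model, d_ff))

def ffn_full(x):
    # Dense path: compute all d_ff neurons.
    return np.maximum(x @ W1, 0) @ W2

def ffn_sparse(x, k):
    # Sparse path: score neurons, keep the k highest-scoring ones,
    # and multiply only the matching columns of W1 and rows of W2.
    idx = np.argsort(x @ P)[-k:]
    h = np.maximum(x @ W1[:, idx], 0)
    return h @ W2[idx, :]

x = rng.normal(size=d_model)
print(ffn_full(x))
print(ffn_sparse(x, k))  # approximation using k of d_ff neurons
```

With k = d_ff the sparse path reproduces the dense output exactly (it merely reorders the neurons); the savings come from choosing k much smaller than d_ff, which a random probe cannot do accurately but a trained predictor can, per the results above.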
NEW FROM DEEPLEARNING.AI

New course with Qualcomm coming soon! In “Introduction to On-Device AI,” you’ll learn how to deploy AI models on edge devices, using local computation for faster inference and privacy. Join the next wave of AI as models go beyond the cloud. Sign up for the waitlist!
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.