Dear friends,
Recently I visited South Korea, where I spoke at length about AI with President Yoon Suk Yeol. Based on what I saw there in government, business, and academia, the nation is well positioned to become a strong AI hub. When he asked me if I would advise South Korea as a member of the Global AI Strategy Steering Group of the country’s National AI Committee, I agreed on the spot. I was delighted to learn this week that Yann LeCun has also joined. I’ve been consistently impressed by the thoughtful approach the Korean government has taken toward AI, with an emphasis on investment and innovation and a realistic understanding of risks without being distracted by science-fiction scenarios of harm.
I’ve advised many countries to build AI for the sectors where they’re strong. For example, I felt that by investing in sectors like tourism and certain industries, Thailand can do projects more efficiently than I can in Silicon Valley. South Korea’s tech ecosystem gives it a foundation to move even faster across multiple sectors. This underscores the long-term value of becoming good at tech, because tech is now pervasive and affects all industries.
Korea has a very strong local software ecosystem. For example, the dominant search engine is not Google or Bing, but Naver (a Korean company). The dominant messaging system is not WhatsApp or WeChat, but KakaoTalk. With local tech giants Naver and Kakao offering email, mobile payment, cloud computing, ride sharing, and other services, the country has many sophisticated tech businesses. Additionally, SK hynix and Samsung are advanced semiconductor manufacturers. It also has a thriving entrepreneurship ecosystem, including Upstage, a language modeling startup, which taught a course with us on “Pretraining LLMs.” Finally, the Korean institutions Seoul National University, which I visited last year, and KAIST have global reputations. Korea has a highly educated population, highly skilled software engineers, and a thriving set of software products. This gives it a fantastic foundation to embrace the next generation of AI. After meeting with businesses in retail, construction, insurance, cosmetics, telecoms, and other industries, I was delighted by the wide variety of opportunities many companies are pursuing across different industry sectors.
Lastly, Korea is known globally for its K-pop. Meeting Bang Si-Hyuk, the chairman of HYBE, which manages the superstar singing group BTS, and learning how the company operates was a real treat! (Another treat was eating at a Korean eel house, where the seafood was unforgettable.)
That’s why I’ve traveled to South Korea four times since last year. My venture studio AI Fund, which collaborates with many Korean companies, has benefited tremendously from the advice of many South Koreans, including Taizo Son, Changmook Kang, Hyungjun Kim, Sung Kim, JP Lee, Ian Park, and Alice Oh. I look forward to doing more in, and with, South Korea! 화이팅 (Let’s go)! Andrew
P.S. We just released the final two courses of AI Python for Beginners! The complete set of four courses is now available and remains free for a limited time. If you know someone who is considering learning to code, please recommend these courses! They teach how to (a) write code using AI assistance, which is where the field is going, and (b) take advantage of generative AI, which allows you to do valuable things quickly. Since releasing the first two courses, I’ve been inspired by many learner stories like this one. Julia K. started with AI Python for Beginners and shortly afterward wrote useful program after useful program. (She accomplished this before we had even finished releasing all four courses!) I hope many others will have similar stories to tell.
A MESSAGE FROM DEEPLEARNING.AI
The final courses of Andrew Ng’s AI Python for Beginners are live! Work on hands-on projects to analyze data, automate tasks, create reusable functions, and extend Python with third-party tools. Join for free today!
News
Long Context Gets Up to Speed
A new model generates tokens faster than current transformers, especially when processing long inputs.
Results: Both versions of Jamba 1.5 produced output tokens faster than other models (running on identical hardware), especially given longer inputs. However, the larger version achieved lower performance on popular benchmarks than other open models.
Behind the news: The mamba architecture, which is designed to enable processing to scale linearly with longer input lengths, has been a subject of much research since its release in late 2023. Notably, Mamba-2, Mamba-2-Hybrid, and Zamba combined mamba layers with attention layers with varying degrees of success.
Models Ranked for Hallucinations
How often do large language models make up information when they generate text based on a retrieved document? A study evaluated the tendency of popular models to hallucinate while performing retrieval-augmented generation (RAG).
What’s new: Galileo, which offers a platform for evaluating AI models, tested 22 models to see whether they hallucinated after retrieving information from documents of various lengths. Claude 3.5 Sonnet was the overall winner, and most models performed best when retrieving information from medium-length documents.
How it works: The researchers tested 10 closed and 12 open models, selected based on their sizes and popularity. They ran each model 20 times at each of three context lengths (short, medium, and long) for a total of 60 tests, and used GPT-4o to evaluate how closely the output text adhered to the context.
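To make the test arithmetic concrete, here is a minimal sketch of how per-run adherence scores could be averaged by context length. It is not Galileo's code. The function name `summarize_adherence`, the assumption that each run yields a score between 0.0 and 1.0, and the placeholder scores are all hypothetical; only the 20 runs at each of three context lengths come from the article.

```python
from statistics import mean

CONTEXT_LENGTHS = ("short", "medium", "long")
RUNS_PER_LENGTH = 20  # from the article: 20 runs at each context length


def summarize_adherence(scores_by_length):
    """Average per-run adherence scores (assumed to lie in 0.0-1.0) per context length."""
    summary = {}
    for length in CONTEXT_LENGTHS:
        runs = scores_by_length[length]
        if len(runs) != RUNS_PER_LENGTH:
            raise ValueError(
                f"expected {RUNS_PER_LENGTH} runs for {length}, got {len(runs)}"
            )
        summary[length] = mean(runs)
    # Unweighted average across the three context lengths (an assumption).
    summary["overall"] = mean(summary[length] for length in CONTEXT_LENGTHS)
    return summary


# Placeholder scores for illustration only; these are NOT Galileo's measurements.
example_scores = {
    "short": [1.0] * 19 + [0.0],
    "medium": [1.0] * 20,
    "long": [1.0] * 20,
}
total_tests = sum(len(runs) for runs in example_scores.values())  # 3 * 20 = 60
summary = summarize_adherence(example_scores)
```

With these placeholder inputs, `total_tests` is 60, matching the per-model test count described above.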
Results: Anthropic’s Claude 3.5 Sonnet ranked highest overall, achieving 0.97 in short context lengths and 1.0 in medium and long context lengths.
Behind the news: Galileo performed similar tests last year, when it compared performance in both RAG and non-RAG settings (without differentiating among context lengths). Variants of GPT-4 and GPT-3.5 held the top three spots in both settings despite strong showings by Llama 2 and Zephyr 7B. However, the top scores were lower (between 0.70 and 0.77).
Why it matters: Model builders have reduced hallucinations, but the difference between rare falsehoods and none at all may be critical in some applications.
We’re thinking: It’s curious that medium-length RAG contexts generally yielded fewer hallucinations than short or long. Maybe we should give models more context than we think they need.
AI-Powered Policing Goes National
Argentina created a national law-enforcement department that will use AI to detect crimes as they’re committed, investigate them afterward, and predict them before they occur.
What’s new: President Javier Milei of Argentina established the Artificial Intelligence Unit Applied to Security (UIAAS), The Register reported. The unit aims to detect, investigate, and predict criminal activity by using machine learning algorithms to monitor the internet, wireless communications, security cameras, drone surveillance, financial transactions, and other data in real time.
How it works: Milei established the UIAAS in a late-July resolution, creating it under the Ministry of Security shortly after he reorganized the national intelligence agency to give himself more direct control. In December, his security minister quashed public protests against his austerity policies; he promised to identify protesters via “video, digital, or manual means” and bill them for the cost of policing the demonstrations.
Behind the news: Argentina’s government is a presidential representative democratic republic. The country was ruled by a military dictatorship between 1976 and 1983.
Why it matters: AI has valuable uses in law enforcement and security. At the same time, it needs to be applied responsibly and implemented in a way that’s fair and respectful of legal rights such as presumption of innocence.
We’re thinking: Surveillance is easy to abuse, and the notion of predictive policing warrants extreme caution to avoid bias against certain groups, civil rights violations, and other pitfalls. Ensuring that it’s used well requires robust technology, rigid controls, clear oversight, and public transparency. We hope that Argentina, no less than the countries that inspired it to establish a national AI police agency, will put strong safeguards in place.
Making LLMs Explainable
Researchers have probed the inner workings of individual layers of large language models. A new tool applies this approach to all layers.
What’s new: Tom Lieberum and colleagues at Google released Gemma Scope, a system designed to illuminate how each layer in Gemma 2-family large language models responds to a given input token. Gemma Scope is available for the 9 billion-parameter and newly released 2 billion-parameter versions of Gemma 2. You can play with an interactive demo or download the weights.
Key insight: A sparse autoencoder (SAE) is a sparse neural network that learns to reconstruct its input. The authors drew on earlier research into using SAEs to interpret neural networks.
How it works: The authors built over 400 SAEs, one for each layer of Gemma 2 2B and Gemma 2 9B. They fed Gemma 2 examples from its pretraining set and extracted the resulting embeddings at each layer. Given the resulting embeddings from a specific layer, an SAE learned to reconstruct each of them. An additional loss term minimized the number of non-zero outputs from the SAE’s first layer to help ensure that the SAE used only concepts related to the embedding. To interpret an embedding produced by the first layer of the SAE, the team labeled the embedding’s indices with their corresponding concepts. They used two main methods: manual and automatic.
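For readers who want to see the objective in code, here is a minimal sketch of a sparse autoencoder's forward pass and loss, written in plain Python. It is not the Gemma Scope implementation. The class name, layer sizes, random initialization, and `sparsity_weight` are illustrative assumptions, and an L1 penalty on the first-layer outputs stands in as a common proxy for minimizing the number of non-zero outputs; the authors' actual loss may differ. Training (gradient updates) is omitted.

```python
import random


def relu(values):
    """Zero out negative entries so that many features are exactly 0."""
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Hypothetical two-layer SAE: embedding -> sparse features -> reconstruction."""

    def __init__(self, embed_dim, num_features, seed=0):
        rng = random.Random(seed)
        # Small random weights; the sizes and scale are arbitrary choices for this sketch.
        self.w_enc = [[rng.gauss(0, 0.1) for _ in range(embed_dim)]
                      for _ in range(num_features)]
        self.b_enc = [0.0] * num_features
        self.w_dec = [[rng.gauss(0, 0.1) for _ in range(num_features)]
                      for _ in range(embed_dim)]
        self.b_dec = [0.0] * embed_dim

    def encode(self, embedding):
        """First layer: its outputs are the features one would label with concepts."""
        pre = matvec(self.w_enc, embedding)
        return relu([p + b for p, b in zip(pre, self.b_enc)])

    def decode(self, features):
        """Second layer: reconstruct the embedding from the sparse features."""
        out = matvec(self.w_dec, features)
        return [o + b for o, b in zip(out, self.b_dec)]

    def loss(self, embedding, sparsity_weight=0.01):
        features = self.encode(embedding)
        reconstruction = self.decode(features)
        reconstruction_error = sum(
            (r - e) ** 2 for r, e in zip(reconstruction, embedding)
        )
        # L1 penalty: an assumed stand-in for the count of non-zero features.
        sparsity_penalty = sum(abs(f) for f in features)
        return reconstruction_error + sparsity_weight * sparsity_penalty, features


sae = SparseAutoencoder(embed_dim=4, num_features=16)
embedding = [0.5, -1.0, 0.25, 2.0]  # made-up stand-in for one layer's embedding
total_loss, features = sae.loss(embedding)
active = [i for i, f in enumerate(features) if f > 0.0]  # indices of non-zero features
```

In this sketch, the indices in `active` play the role of the embedding indices that the team labeled with concepts.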
Behind the news: Earlier research into using SAEs to interpret neural networks was limited to interpreting a single layer or a small network. Earlier this year, Anthropic used an SAE to interpret Claude 3 Sonnet’s middle layer, building on an earlier report in which they interpreted a single-layer transformer.
Why it matters: Many questions about how LLMs work have yet to be answered: How does fine-tuning change the way a model represents an input? What happens inside a model during chain-of-thought prompting versus unstructured prompting? Training an SAE for each layer is a step toward developing ways to answer these questions.
We’re thinking: In 2017, researchers visualized the layers of a convolutional neural network to show that the deeper the layer, the more complex the concepts it learned. We’re excited by the prospect that SAEs can deliver similar insights with respect to transformers.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Keep our newsletter out of your spam folder by adding our email address to your contacts list.