Dear friends,
I wrote earlier about how my team at AI Fund saw that GPT-3 set a new direction for building language applications, two years before ChatGPT was released. I’ll go out on a limb to make another prediction: I think we’ll see significant growth in AI, including Generative AI, applications running at the edge of the network (PC, laptop, mobile, and so on).
Edge devices have less computation, memory, and storage than the cloud, but here’s why I think those factors won’t stop AI’s growth at the edge.
Further, strong commercial interests are propelling AI to the edge. Chip makers like Nvidia, AMD, and Intel sell chips both to data centers (where sales have grown rapidly) and for use in PCs and laptops (where sales have plummeted since the pandemic). Thus semiconductor manufacturers, PC and laptop makers, and Microsoft (whose sales of the Windows operating system depend on sales of new machines) are highly motivated to encourage adoption of edge AI, since it would likely require consumers to upgrade to devices with modern AI accelerators. In short, many companies stand to benefit from the rise of edge AI and have an incentive to promote it.
Keep learning! Andrew
P.S. My team at Landing AI will present a livestream, “Building Computer Vision Applications,” on Monday, November 6, 2023, at 10 a.m. Pacific Time. We’ll discuss the practical aspects of building vision applications including how to identify and scope vision projects, choose a project type and model, apply data-centric AI, and develop an MLOps pipeline. Register here!
News

Generative AI Calling

Google’s new mobile phones put advanced computer vision and audio research into consumers’ hands.

How it works: Google’s new phones process images in distinctive ways, driven by algorithms that run on the device itself. They raise the bar for Apple, the smartphone leader, to turn its internal projects into market opportunities.
Behind the news: Google researchers have actively pursued AI systems that alter or enhance images, video, and audio.
Why it matters: Smartphones produce most of the world’s photos and videos. Yet generative tools for editing them have been confined to the desktop, social-network photo filters notwithstanding. Google’s new phones bring the world closer to parity between the capabilities of desktop image editors and handheld devices. And the audio-editing capabilities raise the bar all around.

We’re thinking: Earlier this year, Google agreed to uphold voluntary commitments on AI, including developing robust mechanisms, such as watermarks, to identify generated media. Will Google apply such a mark to images edited by Pixel users?
Guiding the Scalpel

A neural network helped brain surgeons decide how much healthy tissue to cut out when removing tumors, while the patients were on the operating table.

What’s new: Researchers from Amsterdam University Medical Centers and Princess Máxima Center for Pediatric Oncology in the Netherlands built a system to assess how aggressively surgeons should treat tumors. It worked accurately and quickly enough to let doctors adjust their approach in the operating room.

How it works: The authors trained a system of four vanilla neural networks to classify brain tumors from DNA samples collected during surgery.
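The summary above doesn’t spell out the architecture, but the results below include cases the system was unable to classify, which suggests it abstains when its confidence is low. Here’s a minimal sketch of that pattern, assuming a small feed-forward ensemble over DNA-derived features; the feature count, layer sizes, class count, and confidence threshold are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch of a small classifier ensemble with an abstain
# option. The input features (e.g., DNA methylation markers), layer
# sizes, class count, and threshold are assumptions for the sake of
# the example, not details taken from the paper.
import torch
import torch.nn as nn

class TumorNet(nn.Module):
    """A vanilla feed-forward classifier."""
    def __init__(self, n_features, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def classify(models, x, threshold=0.8):
    """Average the ensemble's softmax outputs for one sample; abstain
    (return None) when the top class falls below the threshold."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in models]).mean(dim=0)
    confidence, label = probs.max(dim=-1)
    return int(label) if confidence >= threshold else None

# Example: four networks over a hypothetical 1,000-marker feature vector.
models = [TumorNet(1000, 20) for _ in range(4)]
prediction = classify(models, torch.rand(1000))  # None means "unable to classify"
```

An abstain threshold like this trades coverage for safety: the system answers only when it is fairly sure, which fits the pattern of classified, misclassified, and unclassified cases reported below.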
Results: The authors’ system performed well on tumor DNA samples both from an existing collection and gathered in an operating room. Tested on samples from 415 tumors, it classified 60.7 percent accurately, misclassified 1.9 percent, and was unable to classify the remaining 37.3 percent. Tested on samples collected during 25 real surgeries, it correctly classified 18 tumors and was unable to classify 7. In all cases, it returned results within 90 minutes (45 minutes to collect the DNA and 45 minutes to analyze it).

Why it matters: Ninety minutes is fast enough to tell brain surgeons what kind of tumor they’re dealing with in the early phase of an operation. If this technique can be rolled out widely, it may help save many lives.
A MESSAGE FROM DEEPLEARNING.AI

“Generative AI for Everyone,” taught by Andrew Ng, is coming soon! This course demystifies generative AI and assumes no prior experience in coding or machine learning. Learn how generative AI works, how to use it, and how it will affect jobs, businesses, and society. Join the waitlist
Cost Containment for Generative AI

Microsoft is looking to control the expense of its reliance on OpenAI’s models.

What’s new: Microsoft seeks to build leaner language models that perform nearly as well as ChatGPT but cost less to run, The Information reported.

How it works: Microsoft offers a line of AI-powered tools, known as Copilot, that complement the company’s flagship products including Windows, Microsoft 365, and GitHub. The line is based on OpenAI models. Serving those models to more than 1 billion users could amount to an enormous expense, and it occupies processing power that would be useful elsewhere. To manage the cost, Microsoft’s developers are using knowledge distillation, in which a smaller model is trained to mimic the output of a larger one, as well as other techniques.
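For readers unfamiliar with the technique, here’s a minimal sketch of knowledge distillation in PyTorch. It’s a generic illustration, not Microsoft’s actual pipeline; the temperature and training-step structure are conventional choices, and both models are assumed to return logits over the same output vocabulary.

```python
# Minimal knowledge-distillation sketch (illustrative only, not
# Microsoft's pipeline). A small student model learns to match a
# larger teacher's softened output distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

def train_step(student, teacher, batch, optimizer):
    """One optimization step: the frozen teacher labels a batch,
    and the student is updated to mimic its output distribution."""
    with torch.no_grad():
        teacher_logits = teacher(batch)
    student_logits = student(batch)
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, the distillation term is often mixed with an ordinary cross-entropy loss on ground-truth labels, and the temperature controls how much of the teacher’s uncertainty the student is asked to reproduce.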
Behind the news: Microsoft has invested $10 billion in OpenAI. The deal promises the tech giant 75 percent of OpenAI’s operating profit until its investment is repaid, then 49 percent of further profits until reaching an unspecified cap. Meanwhile, Microsoft has access to high-performing models from other sources: its Azure cloud platform serves Meta’s LLaMA 2.

Why it matters: Serving large neural networks at scale is a challenge even for Microsoft, which has immense hardware resources and a favorable agreement with OpenAI. Running distilled and fine-tuned models can cut the cost for tech giants and tiny startups alike.

We’re thinking: If users like Copilot so much that they’re running up a large bill in model inferences, that sounds like a positive sign!
Better Reasoning from ChatGPT

You can get a large language model to solve math problems more accurately if your prompts include a chain of thought: an example that solves a similar problem through a series of intermediate reasoning steps. A new approach to this sort of prompting improved ChatGPT’s accuracy on a variety of reasoning problems.

What’s new: Jiashuo Sun and colleagues at Xiamen University, Microsoft, and IDEA Research introduced iterative bootstrapping in chain-of-thought prompting, a method that prompts a large language model to generate correct chains of thought for difficult problems, so it can use them as guides to solving other problems.

Key insight: Researchers have developed a few ways to prompt a large language model to apply a chain of thought (CoT). The typical method is for a human to write an example CoT for inclusion in a prompt. A faster way is to skip the hand-crafted example and simply instruct the model to “think step by step,” prompting it to generate not only a solution but its own CoT (this is called zero-shot CoT). To improve on zero-shot CoT, other work both (i) asked a model to “think step by step” and (ii) provided generated CoTs (auto-CoT). The weakness of this approach is that the model can generate fallacious CoTs and rely on them when responding to the prompt at hand, which can lead to incorrect responses. To solve this problem, we can draw example problems from a dataset that includes correct responses, so the model can check its answers against the dataset labels. If it’s wrong, it can try repeatedly until it answers correctly. In this way, it generates correct CoT examples to use in solving other problems (a sketch of this loop follows below).

How it works: To prompt ChatGPT to reason effectively, the authors built a database of example problems, chains of thought, and solutions. They drew problems from 11 datasets: six arithmetic reasoning datasets (such as grade-school math word problems), four common-sense reasoning datasets (for example, questions like “Did Aristotle use a laptop?”), and a symbolic reasoning dataset consisting of tasks that involve manipulating letters in words (for instance, “Take the last letters of the words in ‘Steve Sweeney’ and concatenate them”).
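To make the bootstrapping loop concrete, here’s a minimal sketch in Python. The ask_llm helper, prompt wording, and answer parsing are our own placeholders, not the authors’ code; the loop simply captures the idea of retrying with feedback until the model’s answer matches the dataset label.

```python
# Sketch of the iterate-until-correct loop for bootstrapping chains of
# thought. ask_llm is a hypothetical callable that sends a prompt to a
# chat model and returns its text reply; prompt wording is illustrative.

def extract_answer(response):
    """Placeholder: parse the final answer out of the model's reply."""
    return response.strip().splitlines()[-1]

def bootstrap_cot(question, gold_answer, ask_llm, max_tries=4):
    """Ask for step-by-step reasoning, then retry with feedback until the
    final answer matches the dataset label. Returns a verified chain of
    thought, or None if the budget is exhausted."""
    prompt = f"{question}\nLet's think step by step."
    for _ in range(max_tries):
        response = ask_llm(prompt)
        if extract_answer(response) == gold_answer:
            return response  # keep this as a verified CoT example
        # Show the model its incorrect attempt and ask it to revise.
        prompt = (f"{question}\nLet's think step by step.\n"
                  f"The following attempt reached a wrong answer:\n"
                  f"{response}\nPlease re-examine the reasoning and try again.")
    return None
```

Verified chains of thought collected this way can then be prepended to prompts for new problems, taking the place of hand-crafted examples.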
Results: The authors evaluated their method against hand-crafted prompts and auto-CoT. Of the 11 datasets, their method achieved the best results on 8. For example, on grade-school math word problems, ChatGPT prompted using their method achieved 73.6 percent accuracy; using hand-crafted prompts, it achieved 69.3 percent, and using auto-CoT, 71.4 percent. Their method underperformed hand-crafted prompts on two common-sense reasoning datasets (76.8 percent versus 77.1 percent, and 69.3 percent versus 71.1 percent). It underperformed auto-CoT on one arithmetic dataset (91.9 percent versus 92.5 percent).

Why it matters: Large language models have powerful latent capabilities that can be activated by clever prompting. ChatGPT was able to solve the problems in the authors’ database, but only after multiple tries. Prompting it with examples of its own correct solutions to these problems apparently enabled it to solve other, similarly difficult problems without needing multiple tries.

We’re thinking: It may be possible to modify this method to make human input unnecessary by asking the model to fix the problems in its previous generations or by using external tools to validate its outputs.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send them to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, add our email address to your contacts list.