Dear friends,
Building AI products and businesses requires making tough choices about what to build and how to go about it. I’ve heard of two styles: Ready, Aim, Fire, in which you study and validate a direction carefully before committing to it, and Ready, Fire, Aim, in which you jump into development quickly and pivot along the way if necessary.
Say you’ve built a customer-service chatbot for retailers, and you think it could help restaurants, too. Should you take time to study the restaurant market before starting development, moving slowly but cutting the risk of wasting time and resources? Or jump in right away, moving quickly and accepting a higher risk of pivoting or failing?
Ready, Aim, Fire tends to be superior when the cost of execution is high and a study can shed light on how useful or valuable a project could be. For example, if your team can brainstorm a few other use cases (restaurants, airlines, telcos, and so on) and evaluate these cases to identify the most promising one, it may be worth taking the extra time before committing to a direction.
Ready, Fire, Aim tends to be better if you can execute at low cost and, in doing so, determine whether the direction is feasible and discover tweaks that will make it work. For example, if you can build a prototype quickly to figure out if users want the product, and if canceling or pivoting after a small amount of work is acceptable, then it makes sense to consider jumping in quickly. (When taking a shot is inexpensive, it also makes sense to take many shots. In this case, the process is actually Ready, Fire, Aim, Fire, Aim, Fire, Aim, Fire.)
After agreeing upon a product direction, when it comes to building a machine learning model that’s part of the product, I have a bias toward Ready, Fire, Aim. Building models is an iterative process. For many applications, the cost of training and conducting error analysis is not prohibitive. Furthermore, it is very difficult to carry out a study that will shed light on the appropriate model, data, and hyperparameters. So it makes sense to build an end-to-end system quickly and revise it until it works well.
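To make that loop concrete, here’s a minimal sketch in Python. The dataset, model, and metric are illustrative placeholders, not a prescription; the point is the fire-then-aim rhythm of training a quick baseline and studying its errors.

```python
# Minimal sketch of the Ready, Fire, Aim loop for model building:
# fire (train a quick end-to-end baseline), then aim (error analysis), repeat.
# The dataset and model here are illustrative placeholders.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(
    *load_digits(return_X_y=True), test_size=0.2, random_state=0)

# Fire: get a working end-to-end system quickly.
model = LogisticRegression(max_iter=2000).fit(X_train, y_train)
print(f"baseline validation accuracy: {model.score(X_val, y_val):.3f}")

# Aim: error analysis. Inspect what the model gets wrong, then revise the
# model, data, or hyperparameters and train again.
mistakes = [(p, t) for p, t in zip(model.predict(X_val), y_val) if p != t]
print(f"{len(mistakes)} errors; sample (predicted, actual): {mistakes[:5]}")
```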
Keep learning! Andrew
News
Dances With Robots
Tesla unveiled its own AI chip and — surprise! — plans for a humanoid robot. What’s new: At Tesla’s AI Day promotional event, the company offered a first look at Dojo, an upcoming computer for training self-driving models, powered by custom AI chips. To make sure the event got headlines, CEO Elon Musk teased a forthcoming android called the Tesla Bot. Chips and bots: Company executives explained how the company trains models, labels data, and meets various AI challenges, then dove into what’s ahead.
Behind the news: Tesla’s Autopilot system has recently come under government scrutiny. Last week, the U.S. National Highway Traffic Safety Administration launched an investigation into 11 incidents in which Tesla vehicles using Autopilot collided with parked emergency vehicles. If the agency finds Autopilot at fault, it could require the company to change or recall its technology. Why it matters: Tesla’s promise of full self-driving capability was premature, but Dojo’s muscled-up computing power could bring it substantially closer. As for the Tesla Bot, we’re not holding our breath. We’re thinking: Tesla’s genuine achievements — the innovative electric car, charging infrastructure, driver-assistance capabilities — may be overshadowed by stunts like the dancer in the bodysuit. History will decide whether Elon Musk is remembered as a genius at engineering or marketing.
Deep Unlearning
Privacy advocates want deep learning systems to forget what they’ve learned. What’s new: Researchers are seeking ways to remove the influence of particular training examples, such as an individual’s personal information, from a trained model without affecting its performance, Wired reported. How it works: Some researchers have experimented with preparing data prior to training for potential removal later, while others have worked to remove the effect of selected examples retroactively.
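As one concrete illustration of the first approach, here’s a minimal sketch in the spirit of sharded training (the idea behind SISA training, proposed by Bourtoule et al.): train one sub-model per data shard, so deleting an example requires retraining only its shard. The dataset, model, and helper names below are illustrative assumptions, not the systems covered in the article.

```python
# Minimal sketch of shard-based unlearning (in the spirit of SISA training):
# train one model per shard; to forget an example, retrain only its shard.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Partition the training set into shards and train one model per shard.
n_shards = 3
shards = np.array_split(np.arange(len(X)), n_shards)
models = [LogisticRegression(max_iter=1000).fit(X[idx], y[idx]) for idx in shards]

def predict(x):
    # Aggregate the shard models by majority vote.
    votes = [m.predict(x.reshape(1, -1))[0] for m in models]
    return max(set(votes), key=votes.count)

def forget(example_index):
    # To "unlearn" one example, retrain only the shard that contained it;
    # the other shard models never saw the example and stay untouched.
    for s, idx in enumerate(shards):
        if example_index in idx:
            kept = idx[idx != example_index]
            shards[s] = kept
            models[s] = LogisticRegression(max_iter=1000).fit(X[kept], y[kept])
            return

forget(42)  # removes example 42's influence at the cost of retraining one shard
print(predict(X[0]))
```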
Behind the news: Evolving data privacy laws could wreak havoc on machine learning models.
Why it matters: Enabling models to unlearn selectively and incrementally would be less costly than retraining repeatedly from scratch. It also could give users more control over how their data is used and who profits from it. We’re thinking: Wait … what was this article about?
A MESSAGE FROM DEEPLEARNING.AI
Mark your calendar: We’re launching “Deploying Machine Learning Models in Production,” Course 4 of the Machine Learning Engineering for Production (MLOps) Specialization, on September 8, 2021! Pre-enroll now
Full-Bodied With Hints of Forest Fire
Wineries in areas affected by wildfires are using machine learning to produce vintages that don’t taste like smoke. What’s new: Some California winemakers are using a service called Tastry to identify grapes tainted by smoke from the state’s surging blazes and recommend blends that will mask the flavor, The Wall Street Journal reported. How it works: Called CompuBlend, Tastry’s system analyzes grapes’ chemical makeup, including smoke compounds absorbed through their skins. A model recommends other varieties that can mask the taste.
Behind the news: The ancient art of winemaking is adopting AI.
Why it matters: Wildfires are a growing threat to wine regions in Australia, California, and France. They cost the industry an estimated $3.7 billion in 2020. AI could help vintners recoup some of the losses. We’re thinking: While there’s a clear need to adapt to human-induced climate change, it’s tragic that the planet has heated to the point that formerly temperate areas are burning. We applaud the work of Climate Change AI.
Ask Me in a Different Way
Pretrained language models like GPT-3 have shown notable proficiency in few-shot learning. Given a prompt that includes a few example questions and answers (the shots) plus an unanswered question (the task), such models can generate an accurate answer. But good results may depend on more than the shots themselves. What’s new: Ethan Perez, Douwe Kiela, and Kyunghyun Cho subjected GPT-style language models to a test they call true few-shot learning. They found that the heralded few-shot success may depend on a well-engineered prompt. The authors are based at New York University, Facebook, and CIFAR, respectively. Key insight: Training a machine learning model typically requires a validation set to tune hyperparameters such as the learning rate. For GPT-style models, those hyperparameters include the prompt format. In few-shot learning with a pretrained model, the prompt typically contains a handful of examples. However, researchers often experiment extensively to find a prompt format that yields accurate responses. This amounts to stacking the deck in the model’s favor; without such tuning, these models don’t perform nearly as well. How it works: The authors evaluated four sizes of GPT-3, four sizes of GPT-2, and DistilGPT-2. They tested prompt formats from LAMA, a benchmark that comprises factual statements in a variety of formats, and LPAQA, which contains LAMA statements translated from English into a different language and back.
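To show what selecting a prompt format using only the shots themselves looks like, here’s a minimal sketch of leave-one-out cross-validation over candidate formats. The query_model function, templates, and examples are hypothetical placeholders, not the authors’ code.

```python
# Minimal sketch: pick a prompt format via leave-one-out cross-validation
# using only the K shots. query_model is a hypothetical stand-in for a
# GPT-style completion call; the formats and shots are illustrative.
from typing import Callable, List, Tuple

def query_model(prompt: str) -> str:
    # Dummy model so the sketch runs end to end; replace with a real LM call.
    return "France"

def loo_accuracy(fmt: Callable[[str, str], str],
                 shots: List[Tuple[str, str]]) -> float:
    # Hold out each shot in turn, prompt with the rest, and check the answer.
    correct = 0
    for i, (question, answer) in enumerate(shots):
        context = "\n".join(fmt(q, a) for j, (q, a) in enumerate(shots) if j != i)
        prompt = context + "\n" + fmt(question, "")
        correct += query_model(prompt).strip() == answer
    return correct / len(shots)

formats = [
    lambda q, a: f"{q} {a}",
    lambda q, a: f"Q: {q}\nA: {a}",
    lambda q, a: f"Fill in the blank: {q} ___\nAnswer: {a}",
]
shots = [("Paris is the capital of", "France"),
         ("Tokyo is the capital of", "Japan")]
best_format = max(formats, key=lambda fmt: loo_accuracy(fmt, shots))
```

The paper’s point is that even this selection consumes information: with only a handful of shots, cross-validation picks formats that are barely better than average.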
Results: For all models tested, the accuracy achieved with the format selected by cross-validation was only marginally above the mean and significantly below the accuracy of the best format. For instance, for the largest model (GPT-3 with 175 billion parameters), the format chosen by cross-validation scored about 55 percent, mean accuracy was about 54 percent, and the accuracy of the best format was about 60 percent. Why it matters: Previous claims of few-shot learning in GPT-style models left out an important variable: the amount of data used to pick a good prompt format. Choosing among 12 prompt formats boosted accuracy by around 5 percent; choosing among a larger set could make a bigger difference. If researchers don’t report all the information that went into their results, follow-up studies are unlikely to reproduce them. We’re thinking: We like prompt engineering that gets things done on time. We’re less enamored with prompt engineering that muddies the water around few-shot learning.
Work With Andrew Ng
Software Engineers (Remote): Workera, a precision upskilling company that enables individuals and organizations to identify, measure, interpret, and develop AI skills, is looking for software engineers of all levels. You’ll own the mission-critical effort of implementing and deploying innovative learning technologies. Apply here
Solutions Architect: Workera is looking for a solutions architect to empower its go-to-market team, create a streamlined sales-enabling environment, and accelerate business opportunities. Apply here
Various Roles: Workera seeks an enterprise lead product manager, product design lead, compliance and risk manager, FP&A manager, and senior data engineer. Apply here
Data Engineer (LatAm): Factored is looking for top data engineers with experience in data structures and algorithms, operating systems, computer networks, and object-oriented programming. You must have experience with Python and excellent skills in English. Apply here
Software Development Engineer: Landing AI seeks software development engineers to build scalable AI applications and deliver optimized inference software. A strong background in Docker, Kubernetes, infrastructure, network security, or cloud-based development is preferred. Apply in North America or Latin America.
Machine Learning Engineer (Customer Facing): Landing AI is looking for a machine learning engineer to work with internal and external engineers on novel models for customers. A solid background in machine learning and deep learning with proven ability to implement, debug, and deploy machine learning models is a must. Apply here
Sales Development Representative (North America): Landing AI is looking for a salesperson to generate new business opportunities through calls, strategic preparation, and delivering against quota. Experience with inside sales and enterprise products and a proven track record of achieving corporate quotas are preferred. Apply here
Learning Technologist: DeepLearning.AI seeks a technologist to guide and support learners across the platform. We’re looking for someone with a passion for online learning, teaching, and improving the learner experience. Apply here
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, please add our email address to your contacts list.