Dear friends,
With the rise of software engineering over several decades, many principles of how to build traditional software products and businesses are clear. But the principles of how to build AI products and businesses are still developing. I’ve found that there are significant differences, and I’ll explore some of them in this and future letters.
Complex product specification. The specification for a traditional web app might come in the form of a wireframe, but you can’t draw a wireframe to indicate how safe a self-driving car must be. It’s extremely complex to specify operating conditions (sometimes also called the operational design domain) and acceptable error rates under various conditions. Similarly, it can be hard to write a spec for a medical diagnosis tool, depending on how acceptable different types of errors are (since not all errors are equally severe). Further, product specs often evolve as the team discovers what is and isn’t technically feasible.

Need for data. To develop a traditional software product, you might (a) interview users to make sure they want what you aim to build, (b) show them a wireframe to make sure your design meets their needs, and (c) dive into writing the code. If you’re building an AI product, you need to write code, but you also need access to data to train and test the system. This may not be a big challenge. For a consumer product, you may be able to start with a small amount of data from an initial cohort of users. But for a product aimed at business customers — say, AI to optimize shipping or help a hospital manage its medical records — how can you get access to shipping data or medical records? To work around this chicken-and-egg problem, some AI startups start by doing consulting or NRE (non-recurring engineering) work. Those activities are hard to scale, but they afford access to data that can shape a scalable product.
Additional maintenance cost. For traditional software, the boundary conditions — the range of valid inputs — are usually easy to specify. Indeed, traditional software often checks its input to make sure, for example, it’s getting an email address in a field dedicated to that input. But for AI systems, the boundary conditions are less clear. If you have trained a system to process medical records, and the input distribution gradually changes (data drift/concept drift), how can you tell when it has shifted so much that the system requires maintenance?
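One common way to approach that question (a general practice, not a prescription from this letter) is to monitor incoming data against a reference sample from training and raise an alert when the distributions diverge. Here is a minimal, hypothetical sketch using a two-sample Kolmogorov-Smirnov test on a single numeric feature; the threshold and the choice of feature are placeholders.

```python
# Minimal sketch of data-drift monitoring (illustrative only).
# Compares a numeric feature's recent values against a reference
# sample drawn from the training data using a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(reference: np.ndarray, recent: np.ndarray,
                p_threshold: float = 0.01) -> bool:
    """Return True if `recent` looks significantly different from `reference`."""
    statistic, p_value = ks_2samp(reference, recent)
    return p_value < p_threshold

# Hypothetical usage: feature values seen during training vs. recent inputs.
rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
live_feature = rng.normal(loc=0.4, scale=1.2, size=1000)  # shifted distribution

if drift_alert(train_feature, live_feature):
    print("Input distribution has drifted; consider review or retraining.")
```

In practice, a monitoring setup would track many features (and the model's output distribution) and tune alert thresholds to the application's tolerance for false alarms.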
Keep learning,
Andrew
News

Where There’s Smoke, There’s AI

An automated early warning system is alerting firefighters to emerging blazes.

What’s new: South Korean company Alchera trained a computer vision system to monitor more than 800 fire-spotting cameras in Sonoma County, California, the local news channel ABC7 reported.

How it works: Alchera’s Artificial Intelligence Image Recognition (AIIR) spots smoke plumes caught on camera by a portion of California’s Alert Wildfire network. A convolutional neural network flags video frames in which it recognizes smoke plumes, and an LSTM analyzes the time series to confirm the classification. If smoke is confirmed, an alarm alerts an operator at a central monitoring station.
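The report doesn’t publish Alchera’s model details beyond the CNN-plus-LSTM pattern above. The sketch below is a generic, made-up illustration of that pattern: a per-frame CNN produces features, and an LSTM over the frame sequence confirms whether a plume persists across time. All layer sizes are arbitrary.

```python
# Illustrative sketch (not Alchera's code): per-frame CNN features feed an
# LSTM that checks whether a smoke-like signal persists over the clip.
import torch
import torch.nn as nn

class SmokeDetector(nn.Module):
    def __init__(self, feature_dim: int = 64, hidden_dim: int = 32):
        super().__init__()
        # Small CNN applied to each frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feature_dim),
        )
        # LSTM aggregates per-frame features over time.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # score: does smoke persist?

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return torch.sigmoid(self.head(h_n[-1]))  # (batch, 1)

# Hypothetical usage on a 16-frame clip.
clip = torch.randn(1, 16, 3, 128, 128)
print(SmokeDetector()(clip))
```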
Behind the news: Last year, California firefighters used AI to convert aerial imagery into maps to monitor fires that might endanger Yosemite National Park. Wildfires threaten as many as 4.5 million U.S. homes and have wrought havoc in Australia, Pakistan, Russia, and other countries in recent years.

Why it matters: While other wildfire-detection systems rely on sporadic aerial or satellite photos, this one watches continuously via cameras at ground level, enabling it to recognize hazards early and at lower cost.

We’re thinking: This is one hot application!
Synthetic Videos on the Double

Using a neural network to generate realistic videos takes a lot of computation. New work performs the task efficiently enough to run on a beefy personal computer.

What’s new: Wilson Yan, Yunzhi Zhang, and colleagues at UC Berkeley developed VideoGPT, a system that combines image generation with image compression to produce novel videos.

Key insight: It takes less computation to learn from compressed image representations than from full-fledged image representations.

How it works: VideoGPT comprises a VQ-VAE (a 3D convolutional neural network that consists of an encoder, an embedding, and a decoder) and an image generator based on iGPT. The authors trained the models sequentially on BAIR Robot Pushing (clips of a robot arm manipulating various objects) and other datasets.
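As a rough illustration of the two-stage recipe described above (not the authors’ implementation), the sketch below compresses a video into a grid of discrete codes with a toy VQ-VAE; in VideoGPT’s setup, an autoregressive transformer then models those code sequences, and the decoder turns sampled codes back into frames. All class names and dimensions here are invented.

```python
# Schematic of the compress-then-generate idea (not the authors' code).
# Stage 1: a 3D-conv VQ-VAE maps video to discrete codes and back.
# Stage 2 (not shown): an autoregressive transformer models the code sequence.
import torch
import torch.nn as nn

class TinyVQVAE(nn.Module):
    """Toy VQ-VAE: encoder -> nearest-codebook lookup -> decoder."""
    def __init__(self, codebook_size: int = 256, dim: int = 32):
        super().__init__()
        self.encoder = nn.Conv3d(3, dim, kernel_size=4, stride=4)        # downsample
        self.codebook = nn.Embedding(codebook_size, dim)
        self.decoder = nn.ConvTranspose3d(dim, 3, kernel_size=4, stride=4)

    def encode(self, video: torch.Tensor) -> torch.Tensor:
        z = self.encoder(video)                                  # (B, dim, T', H', W')
        flat = z.permute(0, 2, 3, 4, 1).reshape(-1, z.shape[1])  # (N, dim)
        dists = torch.cdist(flat, self.codebook.weight)          # distance to each code
        codes = dists.argmin(dim=1)                              # nearest code index
        return codes.reshape(z.shape[0], -1)                     # (B, T'*H'*W')

    def decode(self, codes: torch.Tensor, shape) -> torch.Tensor:
        z = self.codebook(codes).reshape(*shape, -1).permute(0, 4, 1, 2, 3)
        return self.decoder(z)

# Hypothetical usage: a 16-frame, 64x64 clip becomes 4*16*16 = 1,024 code tokens,
# a much shorter sequence for a transformer to model than raw pixels.
video = torch.randn(1, 3, 16, 64, 64)               # (B, C, T, H, W)
vqvae = TinyVQVAE()
codes = vqvae.encode(video)
recon = vqvae.decode(codes, shape=(1, 4, 16, 16))   # T'=4, H'=16, W'=16
print(codes.shape, recon.shape)
```

A real VQ-VAE also needs a straight-through estimator and commitment loss during training; they’re omitted here to keep the sketch short.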
Results: The authors evaluated VideoGPT’s performance using Fréchet Video Distance (FVD), a measure of the distance between representations of generated output and training examples (lower is better). The system achieved 103.3 FVD after training on eight GPUs. The state-of-the-art Video Transformer achieved 94 FVD after training on 128 TPUs (roughly equivalent to several hundred GPUs).

Why it matters: Using a VQ-VAE to compress and decompress video is not new, but this work shows how it can be used to cut the computation budget for computer vision tasks.

We’re thinking: Setting aside video generation, better video compression is potentially transformative given that most internet traffic is video. The compressed representations in this work, which are tuned to a specific, sometimes narrow training set, may be well suited to imagery from security or baby cams.
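For reference, the FVD metric mentioned in the results above is the Fréchet distance between Gaussians fitted to features of real and generated videos (the published metric extracts features with a pretrained I3D network). With feature means and covariances $(\mu_r, \Sigma_r)$ for real videos and $(\mu_g, \Sigma_g)$ for generated ones:

$$\mathrm{FVD} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)$$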
A MESSAGE FROM DEEPLEARNING.AI

You’re invited! On June 30, 2021, we’ll celebrate the launch of Course 3 in the Machine Learning Engineering for Production (MLOps) Specialization featuring our instructors and leaders in MLOps. Join us for this live event!
Machine Learning for Human Learners

AI is guiding admissions, grading homework, and even teaching classes on college campuses.

What’s new: In a bid to cut costs, many schools are adopting chatbots, personality-assessment tools, and tutoring systems, according to The Hechinger Report, an online publication that covers education. Critics worry that these systems may cause unseen harm.

What they found: AI is used to help manage students at nearly every step of their path through higher education.
Yes, but: Some observers say these systems may be giving inaccurate grades, contributing to bias in admissions, or causing other types of harm.
Why it matters: The pandemic exacerbated an ongoing decline in U.S. university enrollment, which has left colleges scrambling. Automated systems that are carefully designed and sensibly deployed could help streamline processes, reduce costs, and increase access.

We’re thinking: AI has its place on campus. For instance, chatbots can help students figure out where their classes meet. The technology doesn’t yet offer a substitute for good human judgement when it comes to sensitive tasks like assessing performance, but if it can show consistently fair and accurate judgement, it could help reduce the noise that currently afflicts human grading.
Sorting Shattered Traditions

Computer vision is probing the history of ancient pottery.

What’s new: Researchers at Northern Arizona University developed a machine learning model that identifies different styles of Native American painting on ceramic fragments and sorts the shards by historical period.

How it works: The researchers started with an ensemble of VGG16 and ResNet50 convolutional neural networks pretrained on ImageNet. They fine-tuned the ensemble to predict pottery fragments’ historical period.
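The researchers’ exact training setup isn’t described here; the following is a minimal sketch, assuming a Keras-style workflow, of how ImageNet-pretrained VGG16 and ResNet50 backbones might be fine-tuned and averaged into an ensemble classifier. The number of periods, input size, and hyperparameters are placeholders.

```python
# Minimal sketch (not the researchers' code): fine-tune ImageNet-pretrained
# VGG16 and ResNet50 on fragment images, then average their predictions.
import tensorflow as tf

NUM_PERIODS = 7              # placeholder: number of historical periods
INPUT_SHAPE = (224, 224, 3)  # placeholder input size

def build_branch(backbone_fn):
    base = backbone_fn(weights="imagenet", include_top=False,
                       pooling="avg", input_shape=INPUT_SHAPE)
    base.trainable = True    # fine-tune the pretrained weights
    return tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(NUM_PERIODS, activation="softmax"),
    ])

vgg_branch = build_branch(tf.keras.applications.VGG16)
resnet_branch = build_branch(tf.keras.applications.ResNet50)

# Ensemble: average the two branches' class probabilities.
# (A real pipeline would apply each backbone's preprocess_input; omitted here.)
inputs = tf.keras.Input(shape=INPUT_SHAPE)
outputs = tf.keras.layers.Average()([vgg_branch(inputs), resnet_branch(inputs)])
ensemble = tf.keras.Model(inputs, outputs)

ensemble.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                 loss="sparse_categorical_crossentropy",
                 metrics=["accuracy"])
# ensemble.fit(train_images, train_period_labels, validation_data=...)
```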
Results: In tests, the model classified tens of thousands of unlabeled fragments. It scored higher than two experts and roughly matched two others.

Behind the news: AI is helping archaeologists discover long-lost civilizations and make sense of clues they had already uncovered.
Why it matters: For human archaeologists, learning to recognize the patterns on ancient pottery takes years of practice, and they often disagree on a given fragment’s provenance. Machine learning could sift through heaps of pottery shards far more quickly, allowing the humans to focus on interpreting the results.

We’re thinking: Even when experts correctly identify a fragment, they can’t always explain what features led them to their conclusion. Heat maps from machine learning models could help teach the next generation of archaeologists how to read the past.
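As one example of the kind of heat map we have in mind, the sketch below computes a simple gradient-based saliency map from a Keras classifier (such as the hypothetical ensemble sketched earlier, or any image classifier), highlighting the pixels that most affect the predicted period. It stands in for more sophisticated explanation methods such as Grad-CAM.

```python
# Illustrative sketch: a gradient-based saliency heat map showing which pixels
# most influence a classifier's predicted period (hypothetical code).
import tensorflow as tf

def saliency_heatmap(model, image):
    """image: float32 array of shape (224, 224, 3), scaled to [0, 1]."""
    x = tf.convert_to_tensor(image[None, ...])
    with tf.GradientTape() as tape:
        tape.watch(x)
        probs = model(x)
        top_class_score = tf.reduce_max(probs, axis=-1)
    grads = tape.gradient(top_class_score, x)          # d(score)/d(pixel)
    heat = tf.reduce_max(tf.abs(grads), axis=-1)[0]    # collapse color channels
    return (heat / (tf.reduce_max(heat) + 1e-8)).numpy()

# Hypothetical usage:
# fragment = load_fragment_photo(...)            # placeholder loader
# overlay = saliency_heatmap(ensemble, fragment) # visualize, e.g., with matplotlib
```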
A MESSAGE FROM DEEPLEARNING.AI

In “Analyze Datasets and Train ML Models Using AutoML,” Course 1 in our new Practical Data Science Specialization, you’ll learn foundational concepts for exploratory data analysis (EDA), automated machine learning (AutoML), and text classification algorithms. Enroll now

Work With Andrew Ng
VP of Marketing: Podcastle is looking for a marketer to shape its brand marketing system from the ground up. As an early member of this growing startup, you’ll play a strategic role in shaping the direction of the company. Apply here
Data Engineer (Latin America): Factored is looking for top data engineers with experience in data structures and algorithms, operating systems, computer networks, and object-oriented programming. You must have experience with Python and excellent English skills. Apply here
Data Analyst (Latin America): Factored is looking for expert data analysts to analyze datasets, gain insights using descriptive modeling, and code nontrivial functions. A strong background in coding, specifically in Python or R, and excellent English skills are required. Apply here
Data Engineer - Database (Remote): Workera is looking for a data engineer to expand and optimize its transactional and analytical database design, work with analytics stakeholders to define database requirements, and create ETL and data-collection scripts for cross-functional teams. Apply here
Data Scientist - Psychometrics: Workera seeks a data scientist to develop a computer adaptive testing platform, analyze items, and leverage a large skills dataset to improve its assessment capabilities. Apply here
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, add our email address to your contacts list.