Dear friends,
Last week, I described trends that AI Fund, the venture studio I lead, has seen in building AI startups. I'd like to discuss another aspect of building companies that’s unique to AI businesses: the controversial topic of data moats.
A company has a data moat if its access to data makes it difficult for competitors to enter its business. Moat is a common business term used to evoke the water-filled moats built around castles to make them easier to defend against attackers. For example, if a self-driving car company can acquire far more data than its competitors to train and test its system, and if this data makes a material difference in the system’s performance, then its business will be more defensible.
For a few years, some investors asked every AI startup’s founders about its data moat, as if they expected everyone to build one. But, like many things in AI, it depends. A data moat can provide protection, but its effectiveness varies depending on the specific circumstances of the business.
For instance, a data moat may not do much to protect an AI business if:
In contrast, data can make an AI business more defensible if:
Data strategy is important for AI companies, and thinking through how a system’s performance varies with the amount of data, the importance of fresh data, and other factors described above can help you decide how much having data adds to a business’ defensibility. Sometimes a data moat doesn't help at all. But in other cases, it's one pillar (hopefully among many) that makes it harder for competitors to catch up.
Keep learning!
Andrew
News
GPT-Free
Itching to get your hands on a fully trained large language model? The wait is over.
How it works: The OPT architecture is similar to that of OpenAI’s GPT-3. The models were trained on publicly available datasets that include novels, news articles, Reddit posts, and a subset of The Pile.
Behind the news: OPT-175B is the largest and most ambitious open-source language model to date, but it’s not the first.
Yes, but: A parameter count of 175 billion is mouthwatering, but it takes a lot of horsepower to drive a model that large. As Maarten Sap of the Allen Institute for Artificial Intelligence told IEEE Spectrum, “[I’d] love to use OPT-175B,” but “few research labs actually have the infrastructure to run this model.”
Why it matters: For researchers, and really for anyone interested in language modeling, the opportunity is obvious. OPT comes pretrained, ready to be used, fine-tuned, dissected, or adapted for any purposes the AI community dreams up. No more APIs! No more paywalls! It’s your party, so indulge yourself. For Meta, open-sourcing these models may have several benefits. Giving away OPT is a community-minded gesture at a time when the company has been under fire for proliferating hatred, misinformation, and disinformation on a grand scale. It’s a bid to attract talent that could help break in young engineers to the company’s coding practices. And it’s a shot at OpenAI, the former nonprofit, open-source shop, which was criticized for keeping GPT-3’s code under wraps.
We’re thinking: The OPT-175B training log offers a rare look at a large-scale machine learning project. While the mass media may imagine bespectacled programmers in airy, well-lit rooms debating the nature of intelligence, technology development is often messy as researchers struggle to visualize what an algorithm is doing or trace the source of a GPU crash. Worth a look!
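To put the "Yes, but" point above in rough numbers, here is a back-of-the-envelope sketch of how much memory the weights alone of a 175-billion-parameter model occupy. The 16-bit (2 bytes per parameter) storage and the 80 GB accelerator size are assumptions chosen for illustration, not details from Meta's release, and the estimate ignores activations and other runtime overhead.

```python
import math

OPT_175B_PARAMS = 175_000_000_000  # parameter count stated in the story


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def accelerators_needed(memory_gb: float, accelerator_gb: float) -> int:
    """Minimum number of devices whose combined memory fits the weights."""
    return math.ceil(memory_gb / accelerator_gb)


# Assumption: weights stored as 16-bit floats (2 bytes each).
fp16_gb = weight_memory_gb(OPT_175B_PARAMS, 2)
# Assumption: a hypothetical accelerator with 80 GB of memory.
devices = accelerators_needed(fp16_gb, 80)

print(f"Weights at 16-bit precision: {fp16_gb:.0f} GB")
print(f"Devices needed just to hold them: {devices}")
```

Under these assumptions the weights alone come to hundreds of gigabytes, which is why running the model calls for a multi-accelerator setup rather than a single workstation.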
Nurse’s Mechanical Helper
Hospitals are using robots to lighten the load on clinical staff.
What’s new: A number of U.S. hospitals are using Moxi, a robot from Diligent Robotics, to ferry supplies, lab specimens, soiled laundry, and other items, Wired reported.
Behind the news: In 2020, the American Nurses Association assessed Moxi’s performance in three Texas hospitals. The study found that the robots improved nurse productivity and reduced feelings of burnout. However, the robots struggled to navigate crowded hospital halls, and their inability to read expiration dates raised the worry that they might contribute to adverse consequences.
Why it matters: Robots may not have the best bedside manner (yet), but they can create much-needed breathing room for human caregivers. In a 2021 survey of U.S. nurses, 83 percent of respondents said their shifts were understaffed in a way that affected patients’ safety half of the time, and 68 percent had considered leaving the profession. Meanwhile, the U.S. is one of many countries with a rapidly growing population of elderly people, putting further strain on the healthcare system. These conditions create a clear opening for robots capable of performing many low-risk, repetitive chores.
We’re thinking: Come to think of it, Hippocrates’ dictum “first, do no harm” bears a striking similarity to Asimov’s First Law of Robotics, “a robot may not harm a human being.”
A MESSAGE FROM DEEPLEARNING.AI
What’s new about the revised Machine Learning Specialization that’s set to launch in June? It takes the core curriculum — vetted by millions of learners — and makes it more approachable by balancing intuition, code, and math for beginners. Pre-enroll now
Hit Picker
A neural network may help an online music service to spot songs with the potential to go big.
Behind the news: A number of companies offer AI-powered tools designed to enable recording companies, artists, and fans to squeeze more value out of music.
Why it matters: Millions of new songs are released every year. Amid the deluge, AI can help distributors recognize potential hits, recording companies identify talent, fans find music they like, and musicians create sounds that stand out. Of course, the makings of a hit include social dynamics among listeners; presumably that’s where acquirer SoundCloud comes in.
We’re thinking: According to models, this edition of The Batch has moderate energy with high variance and a 72 percent chance of being powerful.
Image Generation + Probabilities
If you want to both synthesize data and find the probability of any given example (say, generate images of manufacturing defects to train a defect detector and identify the highest-probability defects), you may use the architecture known as a normalizing flow. A new type of layer enables users to boost a normalizing flow’s performance by tuning it to their training data.
What’s new: Gianluigi Silvestri at OnePlanet Research Center and colleagues at Google Research and Radboud University introduced the embedded-model flow (EMF). This architecture uses a probabilistic program (a user-defined probability distribution) to influence the training of a normalizing flow.
Normalizing Flow basics: A normalizing flow (NF) is a generative architecture. Like a generative adversarial network (GAN), it learns to synthesize examples similar to its training data. Unlike a GAN, it also learns to calculate the likelihood of existing examples. During training, an NF transforms examples into noise. At inference, it runs in reverse to transform noise into synthetic examples. Thus it requires layers that can execute both forward and backward; that is, layers that are invertible as well as differentiable.
Key insight: Like a normalizing flow layer, the cumulative distribution function (CDF), which is a function of a probability distribution, can be both differentiable and invertible. (In cases where this is not true, it’s possible to approximate the CDF’s derivative or inverse.) The CDF of a probability distribution can be used to compute that distribution, so it can be used to create a probabilistic program. Such a program, being differentiable and invertible, can be used in an NF, where it can transform a random vector to follow a probability distribution and vice versa.
How it works: EMF is a normalizing flow composed of three normalizing flow layers and a user-defined probabilistic program layer.
The authors used a dataset of handwritten digits to train the model to generate digits 0 through 9.
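The key insight above, that a CDF can serve as an invertible, differentiable layer, can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the Gaussian "probabilistic program" and its parameters (`mu=2.0`, `sigma=0.5`) are hypothetical, and the function names are invented for illustration. The layer maps data to noise by applying the program's CDF followed by the inverse CDF of a standard normal, and the change-of-variables formula recovers the likelihood.

```python
import math
from statistics import NormalDist

# Hypothetical user-defined distribution (the "probabilistic program").
program = NormalDist(mu=2.0, sigma=0.5)
# Standard normal noise, the usual base distribution of a flow.
noise = NormalDist()


def forward(x: float) -> float:
    """Data -> noise: push x through the program's CDF, then the noise's inverse CDF."""
    return noise.inv_cdf(program.cdf(x))


def inverse(z: float) -> float:
    """Noise -> data: the same two steps run backward."""
    return program.inv_cdf(noise.cdf(z))


def log_likelihood(x: float) -> float:
    """Change of variables: log p(x) = log p_noise(z) + log |dz/dx|.

    For this layer, dz/dx = program.pdf(x) / noise.pdf(z).
    """
    z = forward(x)
    log_det = math.log(program.pdf(x)) - math.log(noise.pdf(z))
    return math.log(noise.pdf(z)) + log_det


x = 2.3
z = forward(x)
print(f"x={x} -> z={z:.4f} -> x={inverse(z):.4f}")
print(f"log-likelihood of x: {log_likelihood(x):.4f}")
```

In this toy case the layer's log-likelihood equals the log density of the chosen Gaussian, which is a quick sanity check that the transformation and its derivative are consistent.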
Results: The authors compared EMF with a baseline made up of a comparable number of normalizing flow layers. Generating examples in the test set, EMF achieved a negative log likelihood of 1260.8, while the baseline scored 1307.9 (lower is better). EMF outperformed similar baselines trained for other tasks. For instance, generating solutions to the differential equations for Brownian motion, it achieved a negative log likelihood of -26.4 compared to the baseline’s -26.1.
Yes, but: A baseline with an additional normalizing flow layer achieved a better negative log likelihood (1181.3) for generating test-set digits. The authors explain that EMF may have underperformed because it had fewer parameters, although they don’t quantify the difference.
Why it matters: Normalizing flows have their uses, but the requirement that their layers be invertible imposes severe limitations. By proposing a new layer type that improves their performance, this work makes them less forbidding and more useful. In fact, probabilistic programs aren’t difficult to make: They’re easy to diagram, and the authors offer an algorithm that turns such diagrams into normalizing flow layers.
We’re thinking: The authors achieved intriguing results with a small model (three layers, compared to other work) and dataset (10,000 examples compared to, say, ImageNet’s 1.28 million). We look forward to learning what EMF-style models can accomplish with more and wider layers, and with larger datasets like ImageNet.
Work With Andrew Ng
Full-Stack Software Engineer (Latin America): Landing AI is looking for an engineer with experience in software engineering, cloud-based development, frontend technologies and frameworks, and end-to-end product development. In this role, you’ll help design and develop infrastructure for machine learning services and deliver high-quality AI products to our clients. Apply here
Full-Stack Ruby On Rails/React Web Developer: ContentGroove is hiring a full-stack developer in North America to join its remote engineering team. This role will work with the product and design team to help define future features from a functional perspective. Join a fast-growing company with an outstanding executive team! Apply here
Technical Program Manager (Latin America): Landing AI seeks a customer-focused technical program manager to lead multidisciplinary programs across multiple time zones. In this role, you’ll exercise your skills in communication, risk mitigation, and project management to deliver timely, high-quality solutions. The ideal candidate has strong analytical and engineering background, willingness to get their hands dirty when necessary, and drive to solve complex and ambiguous problems. Apply here
Backend Data Engineer (Taipei): DeepLearning.AI seeks a backend data engineer with strong computer-science fundamentals and drive to improve learner experiences. In this role, you’ll execute early-stage development of an educational environment for AI-related topics. Apply here
Frontend Engineer (Taipei): DeepLearning.AI is looking for a frontend engineer with strong computer-science fundamentals and drive to improve learner experiences. In this role, you’ll execute early-stage development of an educational environment for AI-related topics. Apply here
Data Engineer (Latin America): Factored seeks top data engineers with experience in data structures and algorithms, operating systems, computer networks, and object-oriented programming. Experience with Python and excellent English skills are required. Apply here
Software Development Engineer (Latin America): Landing AI is looking for a software engineer with proficiency in best practices, programming languages, and end-to-end product development. In this role, you’ll help to design and develop infrastructure for machine learning services and deliver high-quality AI products. Apply here
UX Designer: Landing AI seeks a UX designer who has experience with enterprise software and applications. In this role, you’ll be central to shaping the company’s products and design culture. Apply here
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.