Dear friends,
Russian troops have invaded Ukraine, and the terrifying prospect of a war in Europe weighs on my mind. My heart goes out to all the civilians affected, and I hope we won’t see the loss of life, liberty, or property that many people fear.
Full disclosure: My early work on deep learning was funded by the U.S. Defense Advanced Research Projects Agency, or DARPA. Last week, Wired mentioned my early work on drone helicopters, also funded by DARPA. During the U.S.-Iraq war, when IEDs (roadside bombs) were killing civilians and soldiers, I spent time thinking about how computer vision could help robots that dispose of IEDs.
What may not be so apparent is that forces that oppose democracy and civil liberties also have access to AI technology. Russian drones have been found to contain parts made in the U.S. and Europe. I wouldn’t be surprised if they also contain open-source software that our community has contributed to. Despite efforts to control exports of advanced chips and other parts that go into AI systems, the prospects are dim for keeping such technology out of the hands of people who would use it to cause harm. So I see little choice but to make sure the forces of democracy and civil liberties have the tools they need to protect themselves.
Several organizations have come to the same conclusion, and they’ve responded by proposing principles designed to tread a fine line between developing AI’s capacity to confer advantage on the battlefield and blunting its potential to cause a catastrophe. For example, the United Nations has issued guidance that all decisions to take human life must involve human judgment. Similarly, the U.S. Department of Defense requires that its AI systems be responsible, equitable, traceable, reliable, and governable.
I support these principles. Still, I’m concerned that such guidelines, while necessary, aren’t sufficient to prevent military abuses. User interfaces can be designed to lead people to accept an automated decision — consider the pervasive “will you accept all cookies from this website?” pop-ups that make it difficult to do anything else. An automated system may comply technically with the U.N. guidance, but if it provides little context and time for its human operator to authorize a kill mission, that person is likely to do so without the necessary oversight or judgment.
While it’s important to establish high-level principles, they must be implemented in a way that enables people to make fateful decisions — perhaps the most difficult decisions anyone can make — in a responsible way. I think of the protocols that govern the use of nuclear weapons, which so far have helped to avoid accidental nuclear war. The systems involved must be subject to review, auditing, and civilian oversight. A plan to use automated weapons could trigger protocols to ensure that the situation, legality, and schedule meet strict criteria, and that the people who are authorized to order such use are clearly identified and held accountable for their decisions.
War is tragic. Collectively we’ve invented wondrous technologies that also have unsettling implications for warfare. Even if the subject presents only a menu of unpalatable options, let’s play an active role in navigating the tough choices needed to foster democracy and civil liberties.
Keep learning,
Andrew
News

High-Energy Deep Learning
Nuclear fusion technology, long touted as an unlimited source of safe, clean energy, took a step toward reality with a machine learning algorithm that molds the fuel in a reactor’s core.

What’s new: Researchers at DeepMind and École Polytechnique Fédérale de Lausanne (EPFL) developed a reinforcement learning algorithm to manipulate hydrogen plasma — an extremely high-energy form of matter — into an optimal shape for energy production.

How it works: Reactors that confine plasma in a chamber known as a tokamak generate energy by pushing its atoms so close together that their nuclei fuse. A tokamak uses powerful magnetic coils to compress the plasma, heating it to roughly 100 million degrees Celsius to overcome the electrostatic force that normally pushes the nuclei apart. The authors trained a reinforcement learning model to control the voltage of 19 magnetic coils in a small, experimental tokamak reactor, enabling them to shape the plasma in ways that are consistent with maintaining an ongoing fusion reaction.
Results: In experimental runs with the real-world reactor, a previous algorithm controlled the coils to form a preliminary plasma shape before handing off the task to the authors’ model. Plasma can’t be observed directly, so the authors calculated its shape and position based on measurements of the magnetic field within the tokamak. In five separate experiments, the controller formed the plasma into distinct shapes, such as a conventional elongated shape and a prospective “snowflake” shape, within particular tolerances (2 centimeters root mean squared error for shape, 5 kiloamperes root mean squared error for the current passing through the plasma). In a novel feat, the algorithm maintained two separate plasma droplets for 200 milliseconds.

Behind the news: Conventional nuclear energy results from nuclear fission. Scientists have been trying to harness nuclear fusion since the 1950s, yet no fusion reactor has generated more energy than it consumed. (The U.S. National Ignition Facility came closest last year.) A growing number of scientists are enlisting machine learning to manage the hundreds of factors involved in sustaining a fusion reaction.
Why it matters: Plasma in a tokamak, which is several times hotter than the sun and reverts to vapor if its electromagnetic container falters, is continually in flux. This work not only shows that deep learning can shape it in real time, it also opens the door to forming plasma in ways that might yield more energy.

The next challenge: Scale up to a reactor large enough to produce meaningful quantities of energy.

We’re thinking: Fusion energy — if it ever works — would be a game changer for civilization. It’s thrilling to see deep learning potentially playing a key role in this technology.
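The setup described above can be pictured as a closed loop: at each control step, a learned policy maps magnetic-field measurements to voltage commands for the 19 coils. The sketch below is purely illustrative; the measurement size, the linear policy, and the reactor stand-ins are invented, whereas the real system uses a trained deep reinforcement learning policy developed in simulation.

```python
# Toy sketch of a measurement -> coil-voltage control loop.
# All numbers and interfaces here are hypothetical.
import random

N_COILS = 19          # number of magnetic coils in the experimental tokamak
N_MEASUREMENTS = 34   # assumed size of the magnetic-measurement vector

def make_policy(n_in, n_out, seed=0):
    """Return a toy linear policy mapping measurements to coil voltages."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    def policy(measurements):
        return [sum(w * m for w, m in zip(row, measurements))
                for row in weights]
    return policy

def control_loop(policy, read_measurements, apply_voltages, steps):
    """Run the closed loop: observe the plasma, command the coils, repeat."""
    voltages = None
    for _ in range(steps):
        obs = read_measurements()        # magnetic-field measurements
        voltages = policy(obs)           # one voltage per coil
        apply_voltages(voltages)         # send commands to the reactor
    return voltages                      # last commanded voltages

# Stand-ins for the reactor interface (pure fiction for this sketch).
applied = []
policy = make_policy(N_MEASUREMENTS, N_COILS)
last = control_loop(policy,
                    read_measurements=lambda: [0.01] * N_MEASUREMENTS,
                    apply_voltages=applied.append,
                    steps=5)
```

In the actual work, the policy is a neural network trained with reinforcement learning against a plasma simulator before being deployed on the physical tokamak.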
Remote Meter Reader

Industrial gauges are often located on rooftops, underground, or in tight spaces — but they’re not out of reach of computer vision.

What’s new: The Okinawa startup LiLz Gauge provides a system that reads analog gauges and reports their output to a remote dashboard. The system is available in Japan and set to roll out globally in 2023.
Behind the news: AI increasingly enables inspectors to do their jobs at a distance. For instance, drones equipped with computer vision have been used to spot damage and deficiencies in buildings, dams, solar and wind farms, and power lines.

Why it matters: Given the complexity of replacing some gauges, computer vision may be more cost-effective than installing a smart meter. More broadly, industrial operations don’t necessarily need to replace old gear if machine learning can give it new life. Well-established machine learning approaches can be engineered to meet the needs of low-tech industries.
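One small piece of such a gauge-reading pipeline can be sketched: once a vision model has estimated the needle’s angle, converting that angle to a reading is linear interpolation over the dial. The angle range and scale limits below are hypothetical, not LiLz’s actual parameters.

```python
# Hypothetical final step of analog-gauge reading: needle angle -> value.
# The dial geometry (angle range, scale limits) is invented for illustration.

def needle_angle_to_reading(angle_deg, angle_min=-45.0, angle_max=225.0,
                            value_min=0.0, value_max=100.0):
    """Linearly interpolate a needle angle (degrees) to a gauge value."""
    span = angle_max - angle_min
    fraction = (angle_deg - angle_min) / span
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the dial's range
    return value_min + fraction * (value_max - value_min)

print(needle_angle_to_reading(90.0))  # needle at mid-dial -> 50.0
```

The harder parts, which this sketch omits, are detecting the gauge face and estimating the needle angle from an image under varying lighting and viewpoints.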
A MESSAGE FROM DEEPLEARNING.AI

Looking to prepare for Google’s TensorFlow Certificate exam? Gain the skills you need to build scalable AI-powered applications with the TensorFlow Developer Professional Certificate program! Enroll today
Scam Definitely

Robocalls slip through smartphone spam filters, but a new generation of deep learning tools promises to tighten the net.

What’s new: Researchers proposed fresh approaches to thwarting robocalls. Such innovations could soon be deployed in apps, IEEE Spectrum reported.
Behind the news: Many robocallers have outgrown the fixed phone numbers, obviously prerecorded messages, and “press-1” phone trees that were dead giveaways in the past, making it harder for recipients to recognize spam calls even after answering the phone.
Why it matters: Robocallers placed nearly 4 billion nuisance calls in the U.S. in January 2021. These numbers have hardly budged since 2019 despite government efforts to combat them. The problem is even worse elsewhere. In Brazil, the average user of one call-blocking app received more than one spam call daily. It’s unlikely that robocalls will ever disappear entirely, but machine learning could relegate them to the background, like email spam.
Fine-Tune Your Fine-Tuning

Let’s say you have a pretrained language model and a small amount of data to fine-tune it to answer yes-or-no questions. Should you fine-tune it to classify yes/no or to fill in missing words — both viable approaches that are likely to yield different results? New work offers a way to decide.

What’s new: Yanan Zheng and collaborators at Beijing Academy of Artificial Intelligence, Carnegie Mellon University, DeepMind, Massachusetts Institute of Technology, and Tsinghua University proposed FewNLU, a method that compares fine-tuning algorithms in few-shot natural language understanding, or language comprehension tasks in which a model must learn from a few examples. They also provide a toolkit for optimizing fine-tuned performance.

Key insight: Previous comparisons of fine-tuning algorithms used fixed hyperparameter values; the researchers chose values known to work with a particular algorithm and kept them for other algorithms. But different combinations of algorithm and architecture require different hyperparameter values to achieve optimal performance. So, to compare fine-tuning algorithms fairly, it’s best to determine hyperparameter values separately for each combination.

How it works: The authors compared various data-split strategies and hyperparameter values for different fine-tuning algorithms applied to DeBERTa and ALBERT. They fine-tuned the models on 64 labeled examples for each of seven tasks in the SuperGLUE benchmark (such as answering yes-or-no questions about a text passage or multiple-choice questions about causes of events) to find the best data-split strategy and most important hyperparameters. Then they compared fine-tuning algorithms using different values for the most important hyperparameters.
Results: Multi-Splits (a data-split strategy that partitions the small labeled set into training and validation sets several times with different random seeds and averages the results) led to superior test performance on 4 of the 7 tasks, and it had the greatest correlation between validation and test performance on 5 of the 7 tasks. Changes in the prompt pattern led to the greatest standard deviation in performance across hyperparameters (an average of 5.5 percent accuracy, compared to the next-highest, training order, at 2.0 percent), suggesting that it was the most important hyperparameter to optimize. Using Multi-Splits and the optimal hyperparameter values for each fine-tuning algorithm (specific to each model and task), PET, ADAPET, and P-tuning performed similarly and typically outperformed CLS by 15 to 20 percentage points in accuracy and F1 score. There was no clear winner among PET, ADAPET, and P-tuning; each achieved the highest accuracy or F1 score on one task or another, often within 1 standard deviation of the others.

Why it matters: It’s certainly good to know how to get the most out of fine-tuning. Beyond that, this work reinforces the notion that, since the only way to know the best hyperparameter values is to find them empirically, it pays to keep guessing to a minimum.

We’re thinking: Here’s a puzzler: if the choice of a fine-tuning algorithm changes a model’s optimal hyperparameter values, is the choice itself a hyperparameter?
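One way to picture the evaluation procedure: with only a few labeled examples, split them into training and validation sets several times, and score each hyperparameter setting by its average validation performance across splits. The toy evaluation function below merely stands in for fine-tuning a real model; all names and values are illustrative, not FewNLU’s actual API.

```python
# Illustrative sketch of selecting hyperparameters by averaging validation
# scores over multiple random train/validation splits of a small labeled set.
import random

def multi_splits_score(examples, hyperparams, train_and_eval,
                       n_splits=4, ratio=0.5):
    """Average validation score of one hyperparameter setting over splits."""
    scores = []
    for seed in range(n_splits):
        rng = random.Random(seed)
        shuffled = examples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * ratio)
        train, val = shuffled[:cut], shuffled[cut:]
        scores.append(train_and_eval(train, val, hyperparams))
    return sum(scores) / len(scores)

def select_hyperparams(examples, candidates, train_and_eval):
    """Pick the candidate setting with the best averaged validation score."""
    return max(candidates,
               key=lambda hp: multi_splits_score(examples, hp, train_and_eval))

# Toy demonstration: the fake "model" simply prefers one learning rate.
examples = list(range(64))  # stand-in for 64 labeled examples
candidates = [{"lr": 1e-5}, {"lr": 3e-5}, {"lr": 1e-4}]
toy_eval = lambda train, val, hp: 1.0 - abs(hp["lr"] - 3e-5) * 1e4
best = select_hyperparams(examples, candidates, toy_eval)
print(best)  # -> {'lr': 3e-05}
```

In the paper, each call to the evaluation function would fine-tune a model such as DeBERTa on the training split and score it on the validation split, which is what makes the search expensive and the split strategy important.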
Work With Andrew Ng
Data Analysts (Latin America): Factored seeks expert data analysts to analyze datasets for insights using descriptive modeling. Excellent English skills and a strong background in Python or R coding are required. Apply here
Data Engineer (Latin America): Factored is looking for top data engineers with experience with data structures and algorithms, operating systems, computer networks, and object-oriented programming. Experience with Python and excellent English skills required. Apply here
Senior Technical Program Manager: Landing AI seeks a program manager to bridge its team and business partners while executing its engineering programs. The ideal candidate has a strong background in customer relationship management and two years in a technical role. Apply here
Software Development Engineer (Latin America): Landing AI is looking for a software engineer with experience in best practices and proficiency in programming languages, as well as experience with end-to-end product development. In this role, you will help to design and develop infrastructure for machine learning services and deliver high-quality AI products to our clients. Apply here
Machine Learning Engineer: Workera is looking for an engineer to shape its product and create strategic advantages. You will build an intelligence layer critical to the value Workera offers its customers. Apply here
Data Scientist: Workera is looking for a data scientist to create unique value for users and enterprise clients. You will improve the company’s assessment capability, generate personalized learning plans, index content, and build other valuable applications. Apply here
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, add our email address to your contacts list.