Dear friends,
A few weeks ago, the White House required that research papers funded by the U.S. government be made freely available online, without embargo, by the end of 2025. The data underlying those publications must also be made available.
Today, there are peer-reviewed journal papers, peer-reviewed conference papers, and non-peer-reviewed papers posted online directly by the authors. Journal articles tend to be longer and undergo peer review and careful revisions. In contrast, conference papers (such as NeurIPS, ICML and ICLR articles) tend to be shorter and less carefully edited, and thus they can be published more quickly. And papers published on arXiv aren’t peer reviewed, so they can be published and reach interested readers immediately.
Keep learning! Andrew
News

Spotting Tax Cheats From Overhead

Tax dodgers can’t hide from AI, especially those who like to swim.

What’s new: French tax authorities, which tax swimming pools according to their size because they increase a home’s property value, netted nearly €10 million using an automated system to identify unregistered pools, Le Parisien reported.

Diving in: Developed by Google and Paris-based consultancy Capgemini, the system spots pools in a public database of aerial images. It then cross-checks them with land-registry data to determine whether they’re registered. France plans to roll it out nationwide this month.
Beneath the surface: At least 17 other European Union tax-collection agencies use AI for tasks that include identifying who should be audited, scraping taxpayer data from ecommerce sites, and powering chatbots that help taxpayers file. Last year, U.S. tax authorities implemented technology from Palantir that identifies fraud by analyzing tax returns, bank statements, property records, and social media activity.

Why it matters: As AI analyzes every nook and cranny of an individual’s data trail, reluctant taxpayers will find it harder to avoid paying up.
The Geopolitics of GPUs

The U.S. government blocked U.S. makers of AI chips from selling to China, adding to existing sanctions that target Russia.

What’s new: The Department of Commerce restricted sales of Nvidia’s and AMD’s most advanced chips for training and running large AI models, Reuters reported.

How it works: U.S. officials didn’t detail the specifics of the ban. Nvidia said it would stop selling its A100 and H100 graphics processing units (GPUs) to China. AMD said the action affects its MI250 GPU.
China’s reaction: “This violates the rules of the market economy, undermines the international economic and trade order, and disrupts the stability of global industrial and supply chains,” a foreign ministry spokesperson said. China hasn’t announced countermeasures, but some analysts anticipate that it will further increase funding to its domestic semiconductor sector.

Why it matters: AI is increasingly intertwined with geopolitics. China has repeatedly stated its intention to achieve “AI supremacy” and outpace the U.S. China, however, is still largely reliant on imported semiconductors, so the U.S. ban could hobble its ambitions.
A MESSAGE FROM DEEPLEARNING.AI

Join us on September 28, 2022, for “Beyond Jupyter Notebooks: MLOps Environment Setup and First Deployment”! This live workshop will show you how to set up your computer to build and deploy machine learning applications so you can run your models in production environments.
Reading Readers

A smart news paywall is optimizing subscriptions without driving away casual readers by carefully timing its invitations to subscribe.

What’s new: The New York Times described Dynamic Meter, a machine learning system that decides how many free articles to provide to a given user before prompting them to register or subscribe.

How it works: The New York Times’ data science team collected a dataset by running a randomized, controlled trial that tracked the behavior of registered (but not yet subscribed) users with various characteristics. Generally, delivering more pop-ups that asked them to subscribe resulted in more subscriptions but fewer page views (prior to subscribing), while delivering fewer pop-ups resulted in fewer subscriptions but greater page views.
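The tradeoff the trial revealed can be sketched as a simple optimization: pick the meter limit that maximizes a weighted combination of expected subscriptions and page views. This is a toy illustration, not the Times’ actual method, and all numbers below are invented.

```python
# Toy sketch of the subscriptions-vs-page-views tradeoff (hypothetical;
# not the Times' Dynamic Meter implementation).

def best_meter_limit(trial, subscription_value, pageview_value):
    """trial maps a free-article limit to (subscription rate, avg. page views)
    as measured in a randomized trial. Return the limit with the highest
    expected value per user."""
    def expected_value(limit):
        sub_rate, page_views = trial[limit]
        return sub_rate * subscription_value + page_views * pageview_value
    return max(trial, key=expected_value)

# Invented trial results: tighter meters yield more subscriptions,
# looser meters yield more page views, echoing the tradeoff above.
trial = {
    1: (0.030, 2.0),    # 1 free article: 3.0% subscribe, 2.0 page views
    5: (0.022, 6.0),
    10: (0.012, 9.0),
}
print(best_meter_limit(trial, subscription_value=100.0, pageview_value=0.25))  # → 5
```

In practice the Times tailors the decision to user characteristics rather than picking one global limit, but the same expected-value logic applies per segment.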
Behind the news: The Wall Street Journal, Switzerland’s Neue Zürcher Zeitung, and Germany’s Frankfurter Allgemeine Zeitung also use machine learning to maximize subscriptions.

Attention to Rows and Columns

Transformers famously require quadratically more computation as input size increases, which has prompted a variety of methods to make them more efficient. A new approach alters the architecture’s self-attention mechanism to balance computational efficiency with performance on vision tasks.

What’s new: Pale-Shaped self-Attention achieved good vision results while applying self-attention to a grid-like pattern of rows and columns within an image. Sitong Wu led the work with colleagues at Baidu Research, Chinese National Engineering Laboratory for Deep Learning Technology and Application, and Chinese Academy of Sciences.

Key insight: Previous attempts to reduce the computational cost of self-attention include axial self-attention, in which a model divides an image into patches and applies self-attention to a single row or column at a time, and cross-shaped attention, which processes a combined row and column at a time. The pale-shaped version processes patches in a pattern of rows and columns (one meaning of “pale” is fence, evoking the lattice of horizontal rails and vertical pickets). This enables self-attention to extract large-scale features from a smaller portion of an image.

How it works: The authors implemented their pale-shaped scheme in Pale Transformer, which processed an image through alternating convolutional layers and groups of 2 to 16 transformer blocks. They trained it on ImageNet.
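A rough NumPy sketch of the idea (illustrative only, not the authors’ implementation): select a fence-like subset of full rows and full columns of patch tokens, then run plain self-attention within that subset. The function names and spacing scheme here are assumptions for the sake of the example.

```python
import numpy as np

def pale_indices(height, width, num_rows, num_cols):
    """Indices of patches in one hypothetical 'pale': evenly spaced full rows
    and full columns of a height x width patch grid."""
    rows = range(0, height, height // num_rows)
    cols = range(0, width, width // num_cols)
    idx = {r * width + c for r in rows for c in range(width)}    # whole rows
    idx |= {r * width + c for c in cols for r in range(height)}  # whole columns
    return sorted(idx)

def self_attention(x):
    """Plain scaled dot-product self-attention over the given tokens."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

height = width = 8                       # an 8x8 grid of patch tokens
x = np.random.randn(height * width, 16)  # 16-dimensional patch embeddings
idx = pale_indices(height, width, num_rows=2, num_cols=2)
out = self_attention(x[idx])             # attend within the pale only
print(len(idx), height * width)          # 28 of 64 tokens take part
```

Because each pale covers only a fraction of the grid, the quadratic cost of attention applies to far fewer tokens than full self-attention over all 64 patches, while every selected token still sees context from distant rows and columns.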
Results: The authors tested three variants of Pale Transformer, each with a different number of parameters: Pale-T (Tiny, 22 million parameters), Pale-S (Small, 48 million parameters), and Pale-B (Base, 85 million parameters). Each achieved better top-1 classification accuracy on ImageNet than competing convolutional neural networks and transformers of similar size. For example, Pale-B achieved state-of-the-art accuracy of 85.8 percent while the best competing model, VOLO-D2 (59 million parameters), scored 85.2 percent. Pale-B required somewhat more computation (15.6 gigaflops) than VOLO-D2 (14.1 gigaflops), but both required far less than a vision transformer with 86 million parameters (55.4 gigaflops). The authors also compared Pale-T against axial and cross-shaped attention. Pale-T achieved 83.4 percent accuracy on ImageNet. The same model with axial attention achieved 82.4 percent and, with cross-shaped attention, achieved 82.8 percent.

Why it matters: This work suggests that there’s room to improve the transformer’s tradeoff between efficiency and performance by changing the way inputs are processed.

We’re thinking: Will this team’s next project be beyond the pale?
Work With Andrew Ng
Senior Controller: Woebot Health seeks a controller to help drive the development of people, processes, technology, compliance, and reporting and ensure that data is available for decision-making. The ideal candidate has 10-plus years of experience. MBA and CPA preferred. Apply here
Financial Analyst: Woebot Health is looking for an analyst to analyze data for the directors and C-suite to support strategic planning and other projects. The ideal candidate has seven-plus years of experience in market research, business analysis, and project management. Apply here
Product Marketing Manager: DeepLearning.AI seeks a product marketing manager who can bring its products to life across multiple channels and platforms including social, email, and the web. The ideal candidate is a creative self-starter who can work collaboratively and independently to execute new ideas and projects, thrives in a fast-paced environment, and has a passion for AI and/or education. Apply here
Data Engineer (Latin America): Factored seeks top data engineers with experience in data structures and algorithms, operating systems, computer networks, and object-oriented programming. Experience with Python and excellent English skills are required. Apply here
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. Avoid our newsletter ending up in your spam folder by adding our email address to your contacts list.