Lecture series "The quest for mathematical understanding of artificial intelligence", Professor Sanjeev Arora (Princeton University)
Videos of the lectures are now available!
Lecture 1: Tour d’Horizon of Artificial Intelligence and Machine Learning today
- Date: Tuesday, 8 November 2022
- Time: 3:00 p.m. (JST)
Abstract: Can machines acquire capabilities that remind us of (or even exceed) general-purpose intelligent reasoning in humans? This question has animated research in computing since the invention of computers in the first half of the 20th century. The past decade has seen dramatic progress in this direction, thanks to very large deep neural network models, as well as a new generation of network architectures, algorithms, and training datasets. Machines have achieved superhuman performance in a range of tasks. The lecture gives a broad, nontechnical overview of these new developments, as well as the questions (scientific, societal, ethical) they raise.
Lecture 2: Deep Learning: Attempts toward mathematical understanding
- Date: Wednesday, 9 November 2022
- Time: 3:00 p.m. (JST)
Abstract: As described in Lecture 1, deep learning underlies many dramatic advances of the past decade. It involves training a massive neural net (aka deep net)—with billions or even trillions of trainable parameters—on very large datasets. Much of this field is empirical and guided by good intuition, but there is also an emerging set of mathematical ideas for understanding the training process as well as the properties of the trained nets. The lecture will give brief introductions to the frameworks of optimization, generalization, the student-teacher setting, infinitely wide nets (the NTK regime), and unsupervised learning. This lecture assumes some comfort with basic linear algebra, calculus, and probability. It will give nonexperts and students some mathematical insight into the field, the phenomena it seeks to understand, and the progress made so far.
Lecture 3: What do we not understand mathematically about deep learning?
- Date: Thursday, 10 November 2022
- Time: 3:00 p.m. (JST)
Abstract: Deep learning has upended many traditional ways of thinking about machine learning and artificial intelligence. This lecture seeks to convey the challenges it poses for our prior mathematical understanding of machine learning, while building upon the vocabulary and concepts of Lecture 2. The focus is on attempts to shed light on the various mysteries of deep learning. Of special relevance is the startling ability of today's very large deep nets to be quickly adapted to new tasks, which calls for quantifying the "skills" and "concepts" captured in the net's parameters, something that remains nebulous from a mathematical viewpoint. We survey a growing number of attempts to peer mathematically into how the net evolves during training. Though focused on results from the past 3-4 years, the lecture should still be accessible to a broad scientific audience with math training at an undergraduate level.