Colloquium

Towards a Science of AI: Scaling laws and synthetic data

by Maissam Barkeshli (University of Maryland, College Park)

America/Toronto
PI/2-292 - Time Room (Perimeter Institute for Theoretical Physics)

60
Description

The stunning capabilities of modern AI systems give rise to many questions regarding how they work and how much more capable they can possibly get. One way to gain additional insight is via synthetic models of data with tunable complexity, which can capture the basic relevant structures of real data. In recent work we have focused on sequences obtained from random walks on graphs, hypergraphs, and hierarchical graphical structures. I will present some recent empirical results regarding how transformers learn sequences arising from random walks on graphs. The focus will be on neural scaling laws, unexpected temperature-dependent effects, and sample complexity. If there is time, I will also discuss the effect of parameterization strategies on hyperparameter scaling laws, where we see the critical importance of appropriately scaling the embedding layer learning rate.
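As a rough illustration of the kind of synthetic data the abstract describes, the sketch below samples token sequences from uniform random walks on a randomly generated directed graph. It is not the speaker's actual setup: the graph construction, the helper names (`make_random_graph`, `random_walk`, `sample_sequences`), and the choice of `num_nodes` and `out_degree` as complexity knobs are all assumptions made for this example.

```python
import random


def make_random_graph(num_nodes, out_degree, seed=0):
    """Hypothetical helper: give each node `out_degree` distinct random neighbors.

    Requires out_degree <= num_nodes - 1. Returns a dict mapping node -> neighbor list.
    """
    rng = random.Random(seed)
    nodes = list(range(num_nodes))
    return {
        u: rng.sample([v for v in nodes if v != u], out_degree)
        for u in nodes
    }


def random_walk(graph, length, rng):
    """Sample one walk of `length` nodes, stepping uniformly along outgoing edges."""
    node = rng.choice(list(graph))
    walk = [node]
    for _ in range(length - 1):
        node = rng.choice(graph[node])
        walk.append(node)
    return walk


def sample_sequences(graph, num_sequences, length, seed=0):
    """Build a small dataset of walks; each node index serves as a token."""
    rng = random.Random(seed)
    return [random_walk(graph, length, rng) for _ in range(num_sequences)]


# Assumed complexity knobs: more nodes or more edges per node means
# more transitions for a sequence model to learn.
graph = make_random_graph(num_nodes=50, out_degree=4, seed=1)
sequences = sample_sequences(graph, num_sequences=3, length=10, seed=2)
for seq in sequences:
    print(" ".join(map(str, seq)))
```

In a setup like this, a sequence model trained on next-token prediction would have to infer the graph's edges from the walks alone, which is one way such data could be used to study sample complexity as the graph size varies.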

Organized by

Jaume Gomis