Apr 7–8, 2025
Perimeter Institute for Theoretical Physics
America/Toronto timezone

State of AI Reasoning for Theoretical Physics - Insights from the TPBench Project

Apr 8, 2025, 9:40 a.m.
10m
PI/1-100 - Theatre (Perimeter Institute for Theoretical Physics)

Speaker

Moritz Munchmeyer (University of Wisconsin–Madison)

Description

The newest large language reasoning models are, for the first time, powerful enough to perform mathematical reasoning in theoretical physics at the graduate level. In the mathematics community, datasets such as FrontierMath are being used to drive progress and evaluate models, but theoretical physics has so far received less attention. In this talk I will present our dataset TPBench (arXiv:2502.15815, tpbench.org), which was constructed to benchmark and improve AI models specifically for theoretical physics. We find extremely rapid progress of models over the last few months, but also significant challenges at research-level difficulty. I will also briefly outline strategies to improve these models for theoretical physics.
