Theory + AI Symposium

America/Toronto
PI/1-100 - Theatre (Perimeter Institute for Theoretical Physics)

Description
 
As Perimeter enters its 25th year, we invite you to imagine what theoretical physics research will look like 25 years from now. On April 7 and 8, Perimeter will be hosting a symposium with speakers and panel discussions focusing on the promise of AI to accelerate progress in theoretical physics. These talks will address the possibilities and challenges associated with AI ‘doing science.’ The event will bring together physicists, engineers, AI researchers, and entrepreneurs to collect different perspectives on what the future of theoretical physics will look like, the engineering challenges we should expect along the way, what tools and collaborations will be needed to help get us there, and what exciting steps are already underway.
 
Confirmed Speakers:

  • Frank Cappello (Argonne National Laboratory) 

  • Yuri Chervonyi (Google DeepMind)

  • Ioana Ciuca (Stanford University) 

  • Deyan Ginev (LaTeXML) 

  • Geoffrey Hinton (University of Toronto)
  • Shirley Ho (Polymathic & Simons Foundation)
  • Vicky Kalogera (Northwestern University) 

  • Jared Kaplan* (Anthropic) 

  • Peter Koepke (University of Bonn) 

  • Roger Melko (University of Waterloo) 

  • Moritz Munchmeyer (University of Wisconsin–Madison) 

  • Axton Pitt (Litmaps) 

  • Xiaoliang Qi (Stanford University) 

  • Oleg Ruchayskiy (Niels Bohr Institute)

  • Gaurav Sahu (Mila)

  • Steinn Sigurdsson (arXiv) 

  • Jesse Thaler (Massachusetts Institute of Technology) 

  • Stephen Wolfram* (Wolfram Research) 

  • Richard Zanibbi (Rochester Institute of Technology) 


*virtual presentation



Scientific Organizers:

  • Matthew Johnson (Perimeter Institute) 

  • Anindita Maiti (Perimeter Institute) 

  • Sabrina Pasterski (Perimeter Institute)


Advisory Committee:

  • Mykola Semenyakin (Perimeter Institute)

 

 
Participants
  • A.W. Peet
  • Achim Kempf
  • Adam Ball
  • Adam Brown
  • Adam Sanderson
  • Adrian Khalil Lopez Raven
  • Afshin Besharat
  • Ahmad Farhad Karimi
  • Aida Ahmadzadegan
  • Ajay Singh
  • Alex Blatov
  • Alex Buchel
  • Alex May
  • Alex Turzillo
  • Ali Hamed Moosavian
  • Ali SaraerToosi
  • Alice Chen
  • Alp Kutlualp
  • Amanda Ferneyhough
  • Amir Sadeghi
  • Amirhossein Dehghanizadeh
  • Ana-Maria Raclariu
  • Anindita Maiti
  • Anna Knorr
  • Anya Petrova
  • Artem Zhutov
  • Arthur Carvalho
  • Asimina Arvanitaki
  • Athanasios Kogios
  • Axton Pitt
  • Barbara Soda
  • Beka Modrekiladze
  • Benjamin MacLellan
  • Besart Lajci
  • Bessma Momani
  • Bianca Dittrich
  • Bindiya Arora
  • Biprateep Dey
  • Blake Bordelon
  • Brandon Zhao
  • Brian Batell
  • Bruno Giménez Umbert
  • Bruno Loureiro
  • Chenjie Wang
  • Chris Waddell
  • Christine Muschik
  • Christopher Bergevin
  • Craig Haney
  • Cunlu Zhou
  • Curt Jaimungal
  • Daniel Roy
  • David Bromley
  • Dawit Belayneh
  • Debbie Leung
  • Deyan Ginev
  • Dhruv Sondhi
  • Dmitry Krotov
  • Dongjin Lee
  • Dongxue Qu
  • Dorsa Sadat Hosseini
  • Duo Xu
  • Eivind Joerstad
  • Emiliia Dyrenkova
  • Emily Petroff
  • Encieh Erfani
  • Enrico Olivucci
  • Erick Arguello
  • Erik Schnetter
  • Estelle Inack
  • Evan Peters
  • Francisco Borges
  • Frank Cappello
  • Friederike Metz
  • Gabriel Pfaffman
  • Gaurav Sahu
  • Gaurav Saxena
  • Gebremedhin Dagnew
  • Geoffrey Hinton
  • Ghazal Geshnizjani
  • Goetz Hoeppe
  • Hamid Afshar
  • Hamid Jahed
  • Hanno Sahlmann
  • Haojun Qiu
  • Hassan Khalvati
  • He Wang
  • Helen Hatzis
  • Himanshu Sahu
  • Hossein Mohebbi
  • Huanqing Chen
  • Hugo Cui
  • Ildar Shar
  • Ilias Kotsireas
  • Ioana Ciuca
  • Irvin Martinez
  • Jahed Abedi
  • James Mertens
  • Jangho Yang
  • Jared Kaplan
  • Jaume Gomis
  • Jerome Quintin
  • Jesse Thaler
  • Jessica Muir
  • Jiaan Li
  • Jo Bovy
  • Jocelyn Read
  • Joel Brownstein
  • John Moffat
  • Jordan Krywonos
  • José Ramón Pareja Monturiol
  • Jun Liu
  • Jun Yong Khoo
  • Jury Radkovski
  • Kelly Wurtz
  • Kendrick Smith
  • Kevin Costello
  • Lauren Greenspan
  • Laurent Freidel
  • Lennart Nacke
  • Leonardo Almeida Lessa
  • Lewis Cole
  • Lijing Shao
  • Luca Brodoloni
  • Luca Ciambelli
  • Lucien Hardy
  • Luigi Di Marino
  • Luisa Lucie-Smith
  • Lukas Mueller
  • Maja Jablonska
  • Mansour Karami
  • Marco Costa
  • Marek Wartak
  • Marina Cortes
  • Mario Krenn
  • Matt Williams
  • Matthew Duschenes
  • Matthew Johnson
  • Matthew Sampson
  • Matthias Le Dall
  • Meenu Kumari
  • Megan Moss
  • Megan Titcomb
  • Meri Zaimi
  • Michael Solodko
  • Mike Walmsley
  • Ming Zhang
  • Mingyuan Zhang
  • Miriam Diamond
  • Mohamad Shalaby
  • Mohamed Hibat-Allah
  • Mohammad Kohandel
  • Mohammed Khalil
  • Mohsen Karkheiran
  • Moritz Munchmeyer
  • Mykola Semenyakin
  • Nabil Fahel
  • Nadie LiTenn
  • Nasser Mohammed
  • Neal Dalal
  • Neige Frankel
  • Ngoc Quy Hoang
  • Nhan Luong
  • Nigel Andersen
  • Nolan Koblischke
  • Oleg Ruchayskiy
  • Omar Hayat
  • Pablo Palacios
  • Paul Koidis
  • Pedro Ponte
  • Peter Donahue
  • Peter Koepke
  • Pierre-Antoine Bernard
  • Pranav Agarwal
  • Qiaoyin Pan
  • Rabindra Nath Das
  • Ramit Dey
  • Rene Meyer
  • Renee Hlozek
  • Riccardo Penco
  • Rich Buttigieg
  • Richard Zanibbi
  • Riley Krembil
  • Ro Jefferson
  • Robert Myers
  • Robert Spekkens
  • Robin Swanson
  • Roger Melko
  • Roland Bittleston
  • Ruolin Liu
  • Sabrina Pasterski
  • Sabrina Sgandurra
  • Saeed Ghadimi
  • Samantha Buck
  • Sandeep Madireddy
  • Sanjit Shashi
  • Saranya Varakunan
  • Sebastian Wetzel
  • Sehmimul Hoque
  • Sepehr Rashidi
  • Sercan Husnugil
  • Sergey Sibiryakov
  • Sergio Sanjurjo
  • Seth Siegel
  • Severyn Balaniuk
  • Shadi Vandvajdi
  • Shammi Tahura
  • Shengqi Sang
  • Shirley Ho
  • Shiyu Zhou
  • Shizhao Zheng
  • Shujin Li
  • Shuwei Liu
  • Sid Goyal
  • Simon Li
  • Simone Speziale
  • Siyang Ling
  • Sotiris Mygdalas
  • Sriram Suryanarayan
  • Steinn Sigurdsson
  • Stephen Wolfram
  • Subhayan Sahu
  • Suroor Seher Gandhi
  • Suvendu Giri
  • Suzan Farhang-Sardroodi
  • Swapnil Dutta
  • Syed Abbas Ahmad
  • Tareq Jaouni
  • Themistocles Zikopoulos
  • Thi Ha Kyaw
  • Timothy Hsieh
  • Tong Ou
  • Tony Salomone
  • Tudor Dimofte
  • Vichayuth Imchitr
  • Vicky Kalogera
  • Waleed Sherif
  • Wayne Chang
  • Weicheng Ye
  • Wenshuai Liu
  • William Donnelly
  • William East
  • Xiaoliang Qi
  • Yangrui Hu
  • Yeong-Cherng Liang
  • Yi Hong Teoh
  • Yilun Guan
  • Yoni Kahn
  • Yousef Rohanizadegan
  • Yuki Yamashita
  • Yuqi Li
  • Yuri Chervonyi
  • Yurii Kvasiuk
  • Yuxuan Li
  • Zach Weiner
  • Zechuan Zheng
  • Zhuo-Yu Xian
    • 1:00 p.m.
      Registration
    • 1:45 p.m.
      Participants make their way into the Theatre
    • 1
      Panel Discussion
      Speakers: Shirley Ho (Polymathic), Vicky Kalogera (Northwestern University), Roger Melko (University of Waterloo), Jesse Thaler (MIT), Marcela Carena (Perimeter Institute)
    • 3:00 p.m.
      Break
    • 2
      Colloquium: Boltzmann Machines

      The standard way to train neural networks is to use the chain rule to backpropagate gradients through layers of neurons. I shall briefly review a few of the engineering successes of backpropagation and then describe a very different way of getting the gradients that, for a while, seemed a lot more plausible as a model of how the brain gets gradients.

      Consider a system composed of binary neurons that can be active or inactive, with weighted pairwise couplings, including long-range couplings. If the neurons represent pixels in a binary image, we can store a set of binary training images by adjusting the coupling weights so that the images are local minima of a Hopfield energy function, which is minus the sum over all pairs of active neurons of their coupling weights. But this energy function can only capture pairwise correlations. It cannot represent the kinds of complicated higher-order correlations that occur in images.

      Now suppose that in addition to the "visible" neurons that represent the pixel intensities, we also have a large set of hidden neurons that have weighted couplings with each other and with the visible neurons. Suppose also that all of the neurons are asynchronous and stochastic: they adopt the active state with log odds equal to the difference in the energy function between the neuron being inactive and active.

      Given a set of training images, is there a simple way to set the weights on all of the couplings so that the training images are local minima of the free energy function obtained by integrating out the states of the hidden neurons? The Boltzmann machine learning algorithm solved this problem in an elegant way. It was proof of principle that learning in neural networks with hidden neurons was possible using only locally available information, contrary to what was generally believed at the time.
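The energy function and stochastic update rule described in the abstract can be sketched in a few lines. This is an illustrative toy with random symmetric couplings and no learning step, not the Boltzmann machine learning algorithm itself; the sizes and sampling schedule are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 8  # total neurons (visible + hidden); illustrative size
W = rng.normal(0.0, 0.1, (n, n))
W = (W + W.T) / 2          # symmetric pairwise couplings
np.fill_diagonal(W, 0.0)   # no self-couplings

def energy(s, W):
    # Hopfield energy: minus the sum over all pairs of active neurons
    # of their coupling weights (s is a 0/1 state vector; the factor
    # 0.5 counts each pair once).
    return -0.5 * s @ W @ s

def gibbs_step(s, W, rng):
    # Asynchronous stochastic update: each neuron turns on with log
    # odds equal to the energy gap between its inactive and active
    # states, E(s_i = 0) - E(s_i = 1) = sum_j W_ij s_j.
    for i in rng.permutation(len(s)):
        gap = W[i] @ s
        p_on = 1.0 / (1.0 + np.exp(-gap))
        s[i] = 1.0 if rng.random() < p_on else 0.0
    return s

# Run the stochastic dynamics from a random initial state.
s = (rng.random(n) < 0.5).astype(float)
for _ in range(100):
    s = gibbs_step(s, W, rng)
```

With hidden neurons included in `s`, the learning problem the abstract poses is to adjust `W` so that training images become local minima of the free energy after the hidden states are integrated out.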

      Speaker: Geoffrey Hinton (University of Toronto)
    • 3
      Opening Remarks
    • 4
      EAIRA: Establishing a methodology to evaluate LLMs as research assistants.

      Recent advancements have positioned Large Language Models (LLMs) as transformative tools for scientific research, capable of addressing complex tasks that require reasoning, problem-solving, and decision-making. Their exceptional capabilities suggest their potential as scientific research assistants, but also highlight the need for holistic, rigorous, and domain-specific evaluation to assess effectiveness in real-world scientific applications. This talk describes a multifaceted methodology for Evaluating AI models as scientific Research Assistants (EAIRA) developed at Argonne National Laboratory.

      This methodology incorporates four primary classes of evaluations: 1) Multiple Choice Questions to assess factual recall; 2) Open Response to evaluate advanced reasoning and problem-solving skills; 3) Lab-Style Experiments involving detailed analysis of capabilities as research assistants in controlled environments; and 4) Field-Style Experiments to capture researcher-LLM interactions at scale in a wide range of scientific domains and applications. These complementary methods enable a comprehensive analysis of LLM strengths and weaknesses with respect to their scientific knowledge, reasoning abilities, and adaptability. Recognizing the rapid pace of LLM advancements, we designed the methodology to evolve and adapt, ensuring its continued relevance and applicability. This talk describes the methodology's current state. Although developed within a subset of scientific domains, the methodology is designed to generalize broadly.
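As a minimal illustration of the first evaluation class, a multiple-choice benchmark reduces to exact-match scoring against an answer key. The function and data below are hypothetical stand-ins for illustration only, not part of the EAIRA tooling.

```python
def score_mcq(model_answers, answer_key):
    # Class 1 of the methodology: multiple-choice questions scored by
    # exact match against an answer key, probing factual recall.
    correct = sum(model_answers.get(q) == a for q, a in answer_key.items())
    return correct / len(answer_key)

# Hypothetical example: the model matches the key on two of three questions.
key = {"q1": "B", "q2": "D", "q3": "A"}
answers = {"q1": "B", "q2": "D", "q3": "C"}
accuracy = score_mcq(answers, key)
```

The other three classes trade this kind of cheap, automatable scoring for increasingly realistic (and harder to grade) open-ended and interactive settings.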

      Speaker: Frank Cappello (Argonne National Laboratory)
    • 5
      State of AI Reasoning for Theoretical Physics - Insights from the TPBench Project

      The newest large-language reasoning models are for the first time powerful enough to perform mathematical reasoning in theoretical physics at the graduate level. In the mathematics community, datasets such as FrontierMath are being used to drive progress and evaluate models, but theoretical physics has so far received less attention. In this talk I will present our dataset TPBench (arXiv:2502.15815, tpbench.org), which was constructed to benchmark and improve AI models specifically for theoretical physics. We find extremely rapid progress in models over the last few months, but also significant challenges at research-level difficulty. I will also briefly outline strategies to improve these models for theoretical physics.

      Speaker: Moritz Munchmeyer (University of Wisconsin–Madison)
    • 6
      UniverseTBD: Democratising Science with AI & Why Stories Matter

      UniverseTBD is an interdisciplinary community of astronomers, AI researchers, engineers, artists and enthusiasts aligned on a bold mission to democratise Science for everyone. From releasing the first large language model in Astronomy, AstroLLaMA-1, to the AI-enabled literature discovery tool Pathfinder, and through our research with AstroPT and HypoGen, our team has pushed the boundaries of AI for Science for the past two years. In this talk, I discuss for the first time how UniverseTBD came to be, our vision, our values, and what drives us and has enabled us to scale our team projects in our commitment to share our learnings with the broader scientific community. I also briefly discuss our latest results with hypothesis generation (HypoGen), multimodal language models (AstroLlaVA-1) and agentic AI (AstroCoder). I conclude with a vision for the future where AI teams up with human researchers to "help us understand the Universe".

      Speaker: Ioana Ciuca (Stanford University)
    • 7
      Panel Discussion: Foundation Models for Theoretical Physics (Physicists)
    • 10:30 a.m.
      Break
    • 8
      arXiv: AI and Physics past, present and future

      The rise of AI, in particular recent LLM-based tools, has had an immediate impact on the production of physics, in ways both good and bad. I discuss some of the impact seen on arXiv in particular, what the status quo and prospects are, and speculate on the longer-term impact.

      Speaker: Steinn Sigurdsson (arXiv)
    • 9
      LaTeXML and the Math-rich Scholarly Web

      We have been on a journey towards scholarly articles with web-native mathematics since the dawn of the internet. The physics Open Science movement has led the way, along with LaTeX, its authoring framework of choice. NIST's LaTeXML is a conversion tool that over the last twenty years has increasingly bridged the gap between LaTeX sources and the web.

      This short talk will outline some of LaTeXML's uses as infrastructure, as well as its enabling effect for search, AI, assistive technologies and the mobile web.

      Speaker: Deyan Ginev (LaTeXML)
    • 10
      Searching Graphics and Text in Technical Documents: A Brief Overview and Plan

      What would effective and usable tools for searching text and graphics in research papers look like? In this talk we sketch a partial answer to this question, with reference to recent work in the Document and Pattern Recognition Lab at RIT. Two multimodal paper search prototypes, one for math (MathDeck) and one for chemistry (ReactionMiner search) will be used for illustration. A simple framework based on 'jars' of available information sources can organize and relate the actions performed by people and automated systems when retrieving, analyzing, and synthesizing sources. We will organize our answer sketch around this framework, and share open questions and research opportunities related to enhancing multi-modal search tools for expert and non-expert users.

      Note: ReactionMiner was developed in collaboration with NCSA and the Han lab at the University of Illinois, Urbana-Champaign.
      MathDeck demo: https://people.rit.edu/ma5339/mathdeck_landing
      ReactionMiner search demo: https://reactionminer.platform.moleculemaker.org/home

      Biography:

      Richard Zanibbi is a Professor of Computer Science at the Rochester Institute of Technology (RIT, USA) where he directs the Document and Pattern Recognition Lab (dprl@RIT). His research focuses upon the recognition and retrieval of graphical notations, particularly for mathematics and chemistry. He is also a member of the Molecule Maker Lab Institute (MMLI), one of the first NSF AI Centers. He received his PhD from Queen's University (Canada), and was an NSERC Postdoctoral Fellow at the Centre for Pattern Recognition and Machine Learning (CENPARMI) at Concordia University before joining RIT.

      Speaker: Richard Zanibbi (Rochester Institute of Technology)
    • 11
      Natural Proof Checking and AI

      From the start of AI, mathematical theorem proving has been an important challenge and technique. We sketch the topics of Automated and Interactive Theorem Proving and present the checking of naturally readable mathematical texts in the Naproche proof system. This involves translations between informal, semi-formal and formal mathematical languages and shows great potential for the use of new AI techniques.

      Speaker: Peter Koepke (University of Bonn)
    • 12
      Panel Discussion: Processing the Data of Theoretical Physics (Engineers)
    • 12:30 p.m.
      Lunch
    • 13
      Accelerating Discovery: Mapping the Future of AI-Enhanced Theoretical Physics

      This talk explores how artificial intelligence could transform theoretical physics over the next 25 years by addressing the crucial challenge of navigating an increasingly complex scientific literature landscape. We introduce Litmaps, a platform leveraging AI and visualization techniques to accelerate literature discovery and insights.
      We illustrate Litmaps' current capabilities in rapidly identifying relevant connections and advancing theoretical research. We also outline critical engineering challenges, including open access to historical literature, data standardization, and managing uncertainty in AI models.
      Finally, we highlight the importance of collaboration among physicists, AI researchers, engineers, and entrepreneurs, to realise the AI-enhanced future of theoretical physics research.

      Speaker: Axton Pitt (Litmaps)
    • 14
      Teaching and Mentoring the AI Scientists

      In the past two years, LLMs have made significant progress in math and reasoning, but they have not yet been applied widely to scientific research tasks. In this talk I will give a brief introduction to our ongoing efforts to build the first AI scientist platform, where researchers in different fields can help teach the AI scientists by contributing benchmarks and specialized tools. We believe that by providing AI with real-time updates of benchmarks and research tools, we are starting to enter an era of innovation driven by new types of human-AI collaboration.

      Speaker: Xiaoliang Qi (Stanford University)
    • 15
      Beyond Articles: Three Pillars of Scientific Transformation

      Scientific research is facing mounting challenges: overwhelmed reviewers, fragmented expertise, an outdated and inefficient system for allocating resources, and dissemination tools that no longer match the complexity and scale of modern scientific output. In this talk, I speak both as a researcher and as the founder of a successful startup to explore what a more coherent, future-ready scientific ecosystem could look like. Drawing on real prototypes and emerging tools, I’ll outline how we might reshape the way we publish, collaborate, and share: not just to fix what’s broken, but to unlock what science could become tomorrow.

      Speaker: Oleg Ruchayskiy (Niels Bohr Institute)
    • 16
      Panel Discussion: Startups Accelerating Theoretical Physics (Entrepreneurs)
    • 17
      Keynote Q&A
      Speaker: Stephen Wolfram (Wolfram Research)
    • 3:00 p.m.
      Break
    • 18
      Human Level AI by 2030
      Speaker: Jared Kaplan (Anthropic)
    • 19
      Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

      We present AlphaGeometry2, a significantly improved version of AlphaGeometry introduced in Trinh et al. (2024), which has now surpassed an average gold medalist in solving Olympiad geometry problems. To achieve this, we first extend the original AlphaGeometry language to tackle harder problems involving movements of objects, and problems containing linear equations of angles, ratios, and distances. This, together with support for non-constructive problems, has markedly improved the coverage rate of the AlphaGeometry language on International Math Olympiad (IMO) 2000-2024 geometry problems from 66% to 88%. The search process of AlphaGeometry2 has also been greatly improved through the use of the Gemini architecture for better language modeling, and a novel knowledge-sharing mechanism that enables effective communication between search trees. Together with further enhancements to the symbolic engine and synthetic data generation, we have significantly boosted the overall solving rate of AlphaGeometry2 to 84% for all geometry problems over the last 25 years, compared to 54% previously. AlphaGeometry2 was also part of the system that achieved silver-medal standard at IMO 2024. Last but not least, we report progress towards using AlphaGeometry2 as part of a fully automated system that reliably solves geometry problems directly from natural language input.

      Speaker: Yuri Chervonyi (Google DeepMind)
    • 20
      LitLLMs, LLMs for Literature Review: Are We There Yet?

      Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially due to the recent influx of research papers. In this talk, we will explore the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract. We will decompose the task into two components: 1. Retrieving related works given a query abstract, and 2. Writing a literature review based on the retrieved results. We will then analyze how effective LLMs are for both components. For retrieval, we will discuss a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we will study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods, while providing insights into the LLM's decision-making process. We will then discuss the two-step generation phase that first outlines a plan for the review and then executes steps in the plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations. We will also see a quick demo of LitLLM in action towards the end.
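The two-step retrieval strategy described above can be caricatured with toy stand-ins: simple frequency-based keyword extraction and keyword-overlap ranking take the place of the LLM call and the external knowledge base. Both functions are illustrative assumptions, not the LitLLM implementation.

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on", "we", "with", "is", "are"}

def extract_keywords(abstract, k=5):
    # Step 1 (stand-in for the LLM keyword-extraction prompt):
    # take the most frequent informative terms from the abstract.
    words = [w.strip(".,;:").lower() for w in abstract.split()]
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
    return [w for w, _ in counts.most_common(k)]

def retrieve(keywords, corpus):
    # Step 2 (stand-in for querying an external knowledge base):
    # rank candidate papers by how many query keywords they mention,
    # dropping papers that match none.
    ranked = sorted(corpus, key=lambda doc: -sum(kw in doc.lower() for kw in keywords))
    return [doc for doc in ranked if any(kw in doc.lower() for kw in keywords)]
```

In the system described in the talk, the re-ranking step would additionally ask an LLM to justify and reorder these candidates, which is where the reported doubling of normalized recall comes from.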

      Speaker: Gaurav Sahu (Mila)
    • 21
      Panel Discussion: Harnessing Breakthroughs in Big Tech (AI Researchers)