Apr 9–11, 2025
Perimeter Institute for Theoretical Physics
America/Toronto timezone

Creativity by Compositionality in Generative Diffusion Models

Apr 9, 2025, 1:30 p.m.
45m
PI/4-400 - Space Room (Perimeter Institute for Theoretical Physics)

Workshop Talk

Speaker

Alessandro Favero (École Polytechnique Fédérale de Lausanne)

Description

Diffusion models have shown remarkable success in generating high-dimensional data such as images and language, a feat only possible if the data has strong underlying structure. Understanding deep generative models thus requires understanding the structure of the data they learn from. In particular, natural data is often composed of features organized hierarchically. In this talk, we will model this structure using probabilistic context-free grammars, tree-like generative models from linguistics. I will present a theory of denoising diffusion on this data, predicting a phase transition that governs the reconstruction of features at different hierarchical levels, and show empirical evidence for it in both image and language diffusion models. I will then discuss how diffusion models learn these grammars, revealing a quantitative relationship between data correlations and the training set size needed to learn how to hierarchically compose new data. Specifically, we predict that the sample complexity scales polynomially with the data dimension, providing a mechanism by which diffusion models avoid the curse of dimensionality. This theory also predicts that models trained on limited data generate outputs that are locally coherent but lack global consistency, an effect empirically confirmed across modalities. These results offer a new perspective on how generative models learn to become creative and compose novel data by progressively uncovering the latent hierarchical structure.
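
To make the data model concrete, here is a minimal sketch in Python of the kind of object the abstract describes: sampling strings from a small probabilistic context-free grammar (PCFG), followed by a crude token-masking "forward noise" in the spirit of discrete diffusion. The grammar, its symbols and probabilities, and the corrupt helper are hypothetical illustrations, not the construction used in the talk.

import random

# Toy probabilistic context-free grammar (PCFG): each nonterminal maps to
# a list of (production, probability) pairs. Uppercase keys are
# nonterminals; lowercase strings are terminal tokens. All symbols and
# probabilities here are made up for illustration.
GRAMMAR = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("det", "noun"), 0.7), (("noun",), 0.3)],
    "VP": [(("verb", "NP"), 0.6), (("verb",), 0.4)],
}

def sample(symbol):
    """Recursively expand a symbol into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:                  # terminal: emit as-is
        return [symbol]
    productions, probs = zip(*GRAMMAR[symbol])
    production = random.choices(productions, weights=probs)[0]
    tokens = []
    for child in production:                   # depth-first tree expansion
        tokens.extend(sample(child))
    return tokens

def corrupt(tokens, t, mask="<mask>"):
    """Independently mask each token with probability t: a crude stand-in
    for the forward (noising) process of diffusion on discrete data."""
    return [mask if random.random() < t else tok for tok in tokens]

if __name__ == "__main__":
    for _ in range(3):
        clean = sample("S")
        print(" ".join(clean), "->", " ".join(corrupt(clean, t=0.5)))

Running this prints a few grammar samples alongside noisy versions. The hierarchical structure arises from the recursive expansion of nonterminals: low levels of the tree fix local token statistics, while high levels fix global composition, which is the distinction the predicted phase transition is about.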
