Apr 7–8, 2025
Perimeter Institute for Theoretical Physics
America/Toronto timezone

Colloquium: Boltzmann Machines

Apr 7, 2025, 3:30 p.m.
1h
PI/1-100 - Theatre (Perimeter Institute for Theoretical Physics)

Speaker

Geoffrey Hinton (University of Toronto)

Description

The standard way to train a neural network is to use the chain rule to backpropagate error gradients through its layers of neurons. I shall briefly review a few of the engineering successes of backpropagation and then describe a very different way of getting the gradients that, for a while, seemed a lot more plausible as a model of how the brain gets them.
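In symbols, a standard textbook statement of the rule the talk refers to (not quoted from the abstract): for a layered network with pre-activations $z^l = W^l a^{l-1}$, activations $a^l = f(z^l)$, and cost $C$, the chain rule propagates error signals backwards:

\[
\delta^L = \nabla_{a^L} C \odot f'(z^L), \qquad
\delta^l = \big( (W^{l+1})^{\top} \delta^{l+1} \big) \odot f'(z^l), \qquad
\frac{\partial C}{\partial W^l} = \delta^l \, (a^{l-1})^{\top}.
\]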

Consider a system of binary neurons that can be active or inactive, with weighted pairwise couplings between the neurons, including long-range couplings. If the neurons represent pixels in a binary image, we can store a set of binary training images by adjusting the coupling weights so that the images are local minima of a Hopfield energy function, which is minus the sum, over all pairs of active neurons, of their coupling weights. But this energy function can only capture pairwise correlations; it cannot represent the kinds of complicated higher-order correlations that occur in images.

Now suppose that in addition to the "visible" neurons that represent the pixel intensities, we also have a large set of hidden neurons that have weighted couplings with each other and with the visible neurons. Suppose also that all of the neurons are asynchronous and stochastic: each neuron adopts the active state with log odds equal to the difference in the energy function between the neuron being inactive and being active. Given a set of training images, is there a simple way to set the weights on all of the couplings so that the training images are local minima of the free energy function obtained by integrating out the states of the hidden neurons?

The Boltzmann machine learning algorithm solved this problem in an elegant way. It was a proof of principle that learning in neural networks with hidden neurons was possible using only locally available information, contrary to what was generally believed at the time.
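In symbols, a standard formulation consistent with the description above (not quoted from the abstract): for binary states $s_i \in \{0, 1\}$ and symmetric weights $w_{ij}$,

\[
E(\mathbf{s}) = -\sum_{i<j} w_{ij}\, s_i s_j, \qquad
p(s_i = 1) = \sigma(\Delta E_i), \quad
\Delta E_i = E(s_i{=}0) - E(s_i{=}1) = \sum_j w_{ij} s_j,
\]

where $\sigma(x) = 1/(1 + e^{-x})$. The free energy of a visible vector $\mathbf{v}$ integrates out the hidden states $\mathbf{h}$,

\[
F(\mathbf{v}) = -\log \sum_{\mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})},
\]

and the Boltzmann machine learning rule follows the gradient of the log likelihood using only local pairwise statistics:

\[
\Delta w_{ij} \propto \langle s_i s_j \rangle_{\text{data}} - \langle s_i s_j \rangle_{\text{model}}.
\]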

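As a concrete illustration, here is a minimal Python sketch of the two-phase learning procedure. Everything specific in it is an assumption made for illustration: the 3x3 toy images, the number of hidden units, the chain lengths, and the learning rate. A faithful implementation of the original algorithm would also anneal the sampling temperature and run much longer chains.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid = 9, 4          # visible (pixel) and hidden unit counts (illustrative)
n = n_vis + n_hid
W = 0.01 * rng.standard_normal((n, n))
W = (W + W.T) / 2.0          # symmetric couplings
np.fill_diagonal(W, 0.0)     # no self-couplings

def gibbs_sweep(s, clamp_visible):
    # One asynchronous sweep: each unit turns on with probability sigmoid(gap),
    # where gap_i = E(s_i=0) - E(s_i=1) = sum_j w_ij s_j.
    for i in rng.permutation(n):
        if clamp_visible and i < n_vis:
            continue         # visible units are held at the data in the positive phase
        gap = W[i] @ s
        s[i] = float(rng.random() < 1.0 / (1.0 + np.exp(-gap)))
    return s

def pairwise_stats(data=None, n_chains=10, sweeps=20):
    # Estimate <s_i s_j> with visibles clamped to data (positive phase)
    # or with the network running freely (negative phase).
    stats = np.zeros((n, n))
    cases = len(data) if data is not None else n_chains
    for c in range(cases):
        s = np.zeros(n)
        if data is not None:
            s[:n_vis] = data[c]
        s[n_vis:] = (rng.random(n_hid) < 0.5).astype(float)
        for _ in range(sweeps):
            s = gibbs_sweep(s, clamp_visible=data is not None)
        stats += np.outer(s, s)
    return stats / cases

# Two toy 3x3 "training images": a horizontal bar and a vertical bar.
data = np.array([[1, 1, 1, 0, 0, 0, 0, 0, 0],
                 [1, 0, 0, 1, 0, 0, 1, 0, 0]], dtype=float)

lr = 0.05
for epoch in range(200):
    positive = pairwise_stats(data=data)   # hidden units settle around clamped data
    negative = pairwise_stats(data=None)   # free-running "model" statistics
    W += lr * (positive - negative)        # learning from locally available statistics
    W = (W + W.T) / 2.0                    # keep couplings symmetric
    np.fill_diagonal(W, 0.0)               # and free of self-loops
```
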
Presentation materials

There are no materials yet.
