Exact Sampling with Markov Chains

David Bruce Wilson

Ph.D. thesis, MIT Mathematics Department, May 28, 1996. (gzipped postscript)

Abstract: Random sampling has found numerous applications in computer science, statistics, and physics. The most widely applicable method of random sampling is to use a Markov chain whose steady state distribution is the probability distribution π from which we wish to sample. After the Markov chain has been run for long enough, its state is approximately distributed according to π. The principal problem with this approach is that it is often difficult to determine how long to run the Markov chain. In this thesis we present several algorithms that use Markov chains to return samples distributed exactly according to π. The algorithms determine on their own how long to run the Markov chain. Two of the algorithms may be used with any Markov chain, but are useful only if the state space is not too large. Nonetheless, a spin-off of these two algorithms is a procedure for sampling random spanning trees of a directed graph that runs more quickly than the Aldous/Broder algorithm. Another of the exact sampling algorithms presented here is much more efficient and may be used on Markov chains with huge state spaces, provided that they have a suitable special structure. Many Markov chains of practical interest have this structure, including some from statistical mechanics, and this exact sampling algorithm has already been implemented by the author and others. One example from physics is the random cluster model, which generalizes the Ising and Potts models. Of particular interest are the so-called critical exponents, which characterize the singular behavior of quantities such as the heat capacity and spontaneous magnetization at a certain critical temperature. In this thesis we also see how to use these sampling algorithms to generate (perfectly random) ``omnithermal'' random cluster samples. Given a single such sample, one can quickly read off a random Potts configuration at any prescribed temperature. We include preliminary results on the use of omnithermal samples to estimate the critical exponents.

Thesis supervisor: James G. Propp
Thesis committee: Persi Diaconis, David Karger, and James Propp

Remarks: This thesis is a melded version of the following four articles, together with some material that has not appeared elsewhere.