Monte Carlo, and Other Kinds of Stochastic Simulation

Last update: 01 May 2025 22:47
First version: 1 March 2003

Monte Carlo is an approximation procedure. The basic idea is as follows. You want to know the average value of some random variable. You can't work out what its distribution is, exactly, or you don't want to do integrals numerically, but you can take samples from that distribution. (The random variable may, for instance, be some complicated function of variables with simple distributions, or they distribution may have a hard-to-compute normalizing factor ["partition function" in statistical mechanics].) To approximate it, you simply take samples, independently, and average them. If you take enough samples, then the law of large numbers says your average must be close to the true value. The central limit theorem says that your average has a Gaussian distribution around the true value.

Here's one of the canonical examples. Say you want to measure the area of a shape with a complicated, irregular outline. The Monte Carlo approach is to draw a square around the shape and measure the square. Now you throw darts into the square, as uniformly as possible. The fraction of darts falling on the shape gives the ratio of the area of the shape to the area of the square. Now, in fact, you can cast almost any integral problem, or any averaging problem, into this form. So you need a good way to tell if you're inside the outline, and you need a good way to figure out how many darts you should throw. Last but not least, you need a good way to throw darts uniformly, i.e., a good random number generator. That's a whole separate art I shan't attempt to describe (see Devroye instead).

Now, in fact, you don't strictly need to sample independently. You can have dependence, so long as you end up visiting each point just as many times as you would with independent samples. This is useful, since it gives a way to exploit properties of Markov chains in designing your sampling strategy, and even of speeding up the convergence of your estimates to the true averages. (The classic instance of this is the Metropolis-Hastings algorithm, which gives you a way of sampling from a distribution where all you have to know is the ratio of the probability densities at any two points. This is extremely useful when, as in many problems in statistics and statistical mechanics, the density itself contains a complicated normalizing factor; that drops out of the ratio.)

Monte Carlo methods originated in physics, where the integrals desired involved hydrodynamics in complicated geometries with internal heating, i.e., designing nukes. The statisticans were surprisingly slow to pick up on it, though by now they have, especially as "Markov chain Monte Carlo," abbreviated "MC Monte Carlo" (suggesting an gambling rapper) or just "MCMC". Along the way they picked up the odd idea that Monte Carlo had something to do with Bayesianism. In fact it's a general technique for estimating sample distributions and related quantities, and as such it's entirely legitimate for frequentists. Physicists now sometimes use the term for any kind of stochastic estimation or simulation procedure, though I think it's properly reserved for estimating integrals and averages.

Ergodic Theory of Markov and Related Processes
Filtering and State Estimation (for particle filtering)
Large Deviations
Markov Models
Simulation-Based Inference
Statistical Mechanics
Statistical Emulators for Simulation Models
Statistics
Stochastic Approximation

J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods [Now-classic work from 1964. No longer recommendable for specific techniques, but still good as a general orientation.]
Cristopher Moore and Stephan Mertens, The Nature of Computation [Cris and Stephan were kind enough to let me read this in manuscript; it's magnificent. Review: Intellects Vast and Warm and Sympathetic]
Radford M. Neal, "Probabilistic Inference using Markov Chain Monte Carlo Methods" [full text in various formats]
Mark E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics [Excellent, but presumes a pretty strong background in statistical mechanics]
Numerical Recipes [As usual, NR has a decent explanation, and the source-code they provide is fine for many problems, though for more advanced work you need to look elsewhere.]

Roland Assaraf and Michel Caffarel, "Zero-Variance Principle for Monte Carlo Algorithms," Physical Review Letters 83 (1999): 4682--4685
Pierre Brémaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues
Luc Devroye, Non-Uniform Random Variate Generation
Randal Douc and Eric Moulines, "Limit theorems for weighted samples with applications to Sequential Monte Carlo Methods", math.ST/0507042
Arnaud Doucet, Nando De Freitas and Neil Gordon (eds.), Sequential Monte Carlo Methods in Practice
Andrew Gelman and Donald B. Rubin, "Inference from Iterative Simulation Using Multiple Sequences", Statistical Science 7 (1992): 457--472
Charles J. Geyer
- "Practical Markov Chain Monte Carlo", Statistical Science 7 (1992): 473--483
- One Long Run
- Burn-In is Unnecessary
- On the Bogosity of MCMC Diagnostics
Peter Grassberger, "Go with the Winners: a General Monte Carlo Strategy,"Computer Physics Communications 147 (2002): 64--70, cond-mat/0201313
Herman Kahn, "Use of Different Monte Carlo Sampling Techniques", in H. A. Meyer (ed.), Symposion on the Monte Carlo Method [PDF preprint from 1955]
Subhash R. Lele, Brian Dennis, Frithjof Lutscher, "Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods", Ecology Letters 10 (2007): 551--563 [PDF reprint via Prof. Lutscher. There is nothing specifically ecological about this technique]
Willie Neiswanger, Chong Wang, Eric Xing, "Asymptotically Exact, Embarrassingly Parallel MCMC", arxiv:1311.4780
Donald B. Rubin, "The Calculation of Posterior Distributions by Data Augmentation: Comment: A Noniterative Sampling/Importance Resampling Alternative to the Data Augmentation Algorithm for Creating a Few Imputations When Fractions of Missing Information Are Modest: The SIR Algorithm", Journal of the American Statistical Association 82 (1987): 543--546 [JSTOR]
Yun Ju Sung, Charles J. Geyer, "Monte Carlo likelihood inference for missing data models", Annals of Statistics 35 (2007): 990--1011, arxiv:0708.2184

My lecture notes on Monte Carlo and Markov Chains and on Markov Chain Monte Carlo
CRS, "Evaluating Posterior Distributions by Selectively Breeding Prior Samples", arxiv:2203.09077

Rosalind J. Allen, Patrick B. Warren and Pieter Rein ten Wolde, "Sampling Rare Switching Events in Biochemical Networks", q-bio.MN/0406006 = Physical Review Letters 94 (2005): 018104
Christophe Andrieu, Arnaud Doucet and Roman Holenstein, "Particle Markov chain Monte Carlo methods", Journal of the Royal Statistical Society B 72 (2010): 269--342
Christophe Andrieu, Éric Moulines, "On the ergodicity properties of some adaptive MCMC algorithms", Annals of Applied Probability 16 (2006): 1462--1505, math.PR/0610317
Andriy Bandrivskyy, S. Beri, D. G. Luchinsky, R. Mannella, and P. V. E. McClintock, "Fast Monte Carlo simulations and singularities in the probability distributions of non-equilibrium systems," nlin.AO/0212038
Bernd A. Berg
- "Generalized Ensemble Simulations for Complex Systems," cond-mat/0110521
- "Introduction to Markov Chain Monte Carlo Simulations and their Statistical Analysis", cond-mat/0410490
M. J. Betancourt, "A General Metric for Riemannian Manifold Hamiltonian Monte Carlo", arxiv:1212.4693
Anthony Brockwell, Pierre Del Moral, and Arnaud Doucet, "Sequentially interacting Markov chain Monte Carlo methods", Annals of Statistics 38 (2010): 3387--3411
James A. Bucklew, "Conditional Importance Sampling Estimators", IEEE Transactions on Information Theory 51 (2005): 143--153
James A. Bucklew, Sirin Nitinawarat and Jay Wierer, "Universal Simulation Distributions", IEEE Transactions on Information Theory 50 (2004): 2674--2685
Simon Byrne, Mark Girolami, "Geodesic Monte Carlo on Embedded Manifolds", arxiv:1301.6064
Fabien Campillo and Vivien Rossi, "Parallel and interacting Markov chains Monte Carlo method", math.PR/0610181
Olivier Cappé, Randal Douc, Arnaud Guillin, Jean-Michel Marin, Christian P. Robert, "Adaptive Importance Sampling in General Mixture Classes", Statistics and Computing 18 (2008) 447--459, arxiv:0710.4242
A. C. Carter, Alan J. Bray and M. A. Moore, "On the Use of Finite-Size Scaling to Measure Spin-Glass Exponents," cond-mat/0302207
Fergal P. Casey, Joshua J. Waterfall, Ryan N. Gutenkunst, Christopher R. Myers, James P. Sethna, "Variational method for estimating the rate of convergence of Markov Chain Monte Carlo algorithms", physics/0609001
Hock Peng Chan, Tze Leung Lai, "A sequential Monte Carlo approach to computing tail probabilities in stochastic models", Annals of Applied Probability 21 (2011): 2315--2342, arxiv:1202.4582
Yuguo Chen, "Another look at rejection sampling through importance sampling", Statistics and Probability Letters 72 (2005): 277--283 ["We show that rejection sampling is inferior to the importance sampling algorithm in terms of the \chi^2 distance between the proposal distribution and the target distribution..."]
Andrew M. Childs, Ryan B. Patterson and David J. C. MacKay, "Exact Sampling from Non-Attractive Distributions Using Summary States," cond-mat/0005132
Nicolas Chopin, Pierre Jacob, "Free energy Sequential Monte Carlo, application to mixture modelling", arxiv:1006.3002
C. I. Chou, Rongsheng Han, S. P. Li and T. K. Leem "Guided Simulated Annealing Method for Optimization Problems," cond-mat/0302137
Francis Comets, Roberto Fernandez and Pablo A. Ferrari, "Processes with Long Memory: Regenerative Construction and Perfect Simulation," math.PR/0009204
Thomas Dean, Paul Dupuis, "Splitting for Rare Event Simulation: A Large Deviation Approach to Design and Analysis", arxiv:0711.2037
A. B. Deeker and M. Mandjes, "On asymptotically efficient simulation of large deviation probabilities", Advances in Applied Probability 37 (2005): 539--552
Pierre Del Moral, Mean Field Simulation for Monte Carlo Integration
Pierre Del Moral and Arnaud Doucet, "Interacting Markov chain Monte Carlo methods for solving nonlinear measure-valued equations", Annals of Applied Probability 20 (2010): 593--639
Pierre Del Moral, Josselin Garnier, "Genealogical particle analysis of rare events", Annals of Applied Probability 15 (2005): 2496--2534, arxiv:math/0602525
Paul Dupuis and Hui Wang, "Dynamic importance sampling for uniformly recurrent Markov chains", Annals of Applied Probability 15 (2005): 1--38, math.PR/0503454
Tilman Enss, Malte Henkel, Alan Picone and Ulrich Schollwöck, "Ageing phenomena without detailed balance: the contact process", cond-mat/0406147 [Abstract: "The long-time dynamics of the 1D contact process suddenly brought out of an uncorrelated initial state is studied through a light-cone transfer-matrix renormalisation group approach. At criticality, the system undergoes ageing which is characterised through the dynamical scaling of the two-times autocorrelation and autoresponse functions. The observed non-equality of the ageing exponents a and b excludes the possibility of a finite fluctuation-dissipation ratio in the ageing regime. The scaling form of the critical autoresponse function is in agreement with the prediction of local scale-invariance." This approach is supposedly an alternative to some kinds of Monte Carlo]
Daniel Egloff, "Monte Carlo Algorithms for Optimal Stopping and Statistical Learning", math.PR/0408276
Jean-David Fermanian and Bernard Salanié "A Nonparametric Simulated Maximum Likelihood Estimation Method", Econometric Theory 20 (2004): 701--734
James Allen Fill and Mark Huber, "The Randomness Recycler: A New Technique for Perfect Sampling," math.PR/0009242
James M. Flegal, Murali Haran, Galin L. Jones, "Markov Chain Monte Carlo: Can We Trust the Third Significant Figure?", Statistical Science 23 (2008): 250--260, math.ST/0703746
Cheng-Der Fuh, Huei-Wen Teng, Ren-Her Wang, "Efficient Importance Sampling for Rare Event Simulation with Applications", arxiv:1302.0583
Axel Gandy and Patrick Rubin-Delanchy, "An algorithm to compute the power of Monte Carlo tests with guaranteed precision", Annals of Statistics 41 (2013): 125--142, arxiv:1110.1248
Steven T. Garren and Richard L. Smith, "Estimating the second largest eigenvalue of a Markov transition matrix", Bernoulli 6 (2000): 215--242
W. R. Gilks et al, Markov Chain Monte Carlo in Practice
Mark Girolami, Ben Calderhead, Siu A. Chin, "Riemannian Manifold Hamiltonian Monte Carlo", arxiv:0907.1100
Alexander K. Hartmann, "Sampling rare events: statistics of local sequence alginments," cond-mat/0108201
Fred J. Hickernell, Lan Jiang, Yuewei Liu, Art Owen, "Guaranteed Conservative Fixed Width Confidence Intervals Via Monte Carlo Sampling", arxiv:1208.4318
Mark L. Huber, "Approximation algorithms for the normalizing constant of Gibbs distributions", arxiv:1206.2689
Mark Jerrum, "On the approximation of one Markov chain by another", Probability Theory and Related Fields 135 (2006): 1--14 [with special reference to MCMC]
Michael Kastner, "Monte Carlo methods in statistical physics: Mathematical foundations and strategies", arxiv:0906.0858
Ioannis Kontoyiannis and S. P. Meyn, "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability 18 (2008): 1491--1518, arxiv:math/0612040
A. Lecchini-Visintini, J. Lygeros, J. Maciejowski, "Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains", 0709.2989
Qiang Liu, Jian Peng, Alexander Ihler and John Fisher III, "Estimating the Partition Function by Discriminance Sampling", UAI 2015
Ying Liu, Andrew Gelman, Tian Zheng, "Simulation-efficient shortest probability intervals", arxiv:1302.2142
Neal Madras and Dana Randall, "Markov chain decomposition for convergence rate analysis", Annals of Applied Probability 12 (2002): 581--606
Indrek Mandre, Jaan Kalda, "Efficient method of finding scaling exponents from finite-size Monte-Carlo simulations", European Physical Journal B 86 (2013): 56, arxiv:1303.0294
J. D. Munoz, M. A. Novotny and S. J. Mitchell, "Rejection-free Monte Carlo algorithms for models with continuous degrees of freedom," Physical Review E 67 (2003): 026101
K. P. N. Murthy, "Monte Carlo: Basics," cond-mat/0104215
Radford M. Neal
- "Slice Sampling," physics/0009028
- "The Short-Cut Metropolis Method", math.ST/0508060
Ronald C. Neath, "On Convergence Properties of the Monte Carlo EM Algorithm", arxiv:1206.4768
M. A. Novotny, "A Tutorial on Advanced Dynamic Monte Carlo Methods for Systems with Discrete State Spaces," cond-mat/0109182
Christian Robert and George Casella, Monte Carlo Statistical Methods
Gareth O. Roberts and Jeffrey S. Rosenthal, "General state space Markov chains and MCMC algorithms", math.PR/0404033
Sylvain Rubenthaler, Tobias Ryden and Magnus Wiktorsson, "Fast simulated annealing in $\R^d$ and an application to maximum likelihood estimation", math.PR/0609353
R. Y. Rubinstein, "A Stochastic Minimum Cross-Entropy Method for Combinatorial Optimization and Rare-event Estimation", Methodology and Computing in Applied Probability 7 (2005): 5--50
Tilman Sauer, "The Feynman Path Goes Monte Carlo," physics/0107010
Pat Scott, "Pippi — painless parsing, post-processing and plotting of posterior and likelihood samples", European Physical Journal Plus 127 (2012): 138, arxiv:1206.2245
Alexander Y. Shestopaloff, Radford M. Neal, "MCMC for non-linear state space models using ensembles of latent sequences", arxiv:1305.0320
Gabriel Stoltz, "Path Sampling with Stochastic Dynamics: Some New Algorithms", cond-mat/0607650
Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova, "On the convergence of the Metropolis-Hastings Markov chains", arxiv:1302.0654
Andreas Wipf, Statistical Approach to Quantum Field Theory
Bin Yu, "Density Estimation in the $L^{\infty}$ Norm for Dependent Data with Applications to the Gibbs Sampler", Annals of Statistics 21 (1993): 711--735