Monte Carlo, and Other Kinds of Stochastic Simulation
25 Nov 2024 10:30
Monte Carlo is an approximation procedure. The basic idea is as follows. You want to know the average value of some random variable. You can't work out what its distribution is, exactly, or you don't want to do integrals numerically, but you can take samples from that distribution. (The random variable may, for instance, be some complicated function of variables with simple distributions, or they distribution may have a hard-to-compute normalizing factor ["partition function" in statistical mechanics].) To approximate it, you simply take samples, independently, and average them. If you take enough samples, then the law of large numbers says your average must be close to the true value. The central limit theorem says that your average has a Gaussian distribution around the true value.
Here's one of the canonical examples. Say you want to measure the area of a shape with a complicated, irregular outline. The Monte Carlo approach is to draw a square around the shape and measure the square. Now you throw darts into the square, as uniformly as possible. The fraction of darts falling on the shape gives the ratio of the area of the shape to the area of the square. Now, in fact, you can cast almost any integral problem, or any averaging problem, into this form. So you need a good way to tell if you're inside the outline, and you need a good way to figure out how many darts you should throw. Last but not least, you need a good way to throw darts uniformly, i.e., a good random number generator. That's a whole separate art I shan't attempt to describe (see Devroye instead).
Now, in fact, you don't strictly need to sample independently. You can have dependence, so long as you end up visiting each point just as many times as you would with independent samples. This is useful, since it gives a way to exploit properties of Markov chains in designing your sampling strategy, and even of speeding up the convergence of your estimates to the true averages. (The classic instance of this is the Metropolis-Hastings algorithm, which gives you a way of sampling from a distribution where all you have to know is the ratio of the probability densities at any two points. This is extremely useful when, as in many problems in statistics and statistical mechanics, the density itself contains a complicated normalizing factor; that drops out of the ratio.)
Monte Carlo methods originated in physics, where the integrals desired involved hydrodynamics in complicated geometries with internal heating, i.e., designing nukes. The statisticans were surprisingly slow to pick up on it, though by now they have, especially as "Markov chain Monte Carlo," abbreviated "MC Monte Carlo" (suggesting an gambling rapper) or just "MCMC". Along the way they picked up the odd idea that Monte Carlo had something to do with Bayesianism. In fact it's a general technique for estimating sample distributions and related quantities, and as such it's entirely legitimate for frequentists. Physicists now sometimes use the term for any kind of stochastic estimation or simulation procedure, though I think it's properly reserved for estimating integrals and averages.
- See also:
- Filtering and State Estimation (for particle filtering)
- Large Deviations
- Markov Models
- Statistical Mechanics
- Statistical Emulators for Simulation Models
- Statistics
- Stochastic Approximation
- Recommended, big-picture:
- J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods [Now-classic work from 1964. No longer recommendable for specific techniques, but still good as a general orientation.]
- Cristopher Moore and Stephan Mertens, The Nature of Computation [Cris and Stephan were kind enough to let me read this in manuscript; it's magnificent. Review: Intellects Vast and Warm and Sympathetic]
- Radford M. Neal, "Probabilistic Inference using Markov Chain Monte Carlo Methods" [full text in various formats]
- Mark E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics [Excellent, but presumes a pretty strong background in statistical mechanics]
- Numerical Recipes [As usual, NR has a decent explanation, and the source-code they provide is fine for many problems, though for more advanced work you need to look elsewhere.]
- Recommended, details:
- Roland Assaraf and Michel Caffarel, "Zero-Variance Principle for Monte Carlo Algorithms," Physical Review Letters 83 (1999): 4682--4685
- Pierre Brémaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues
- Luc Devroye, Non-Uniform Random Variate Generation
- Randal Douc and Eric Moulines, "Limit theorems for weighted samples with applications to Sequential Monte Carlo Methods", math.ST/0507042
- Arnaud Doucet, Nando De Freitas and Neil Gordon (eds.), Sequential Monte Carlo Methods in Practice
- Andrew Gelman and Donald B. Rubin, "Inference from Iterative Simulation Using Multiple Sequences", Statistical Science 7 (1992): 457--472
- Charles J. Geyer
- "Practical Markov Chain Monte Carlo", Statistical Science 7 (1992): 473--483
- One Long Run
- Burn-In is Unnecessary
- On the Bogosity of MCMC Diagnostics
- Peter Grassberger, "Go with the Winners: a General Monte Carlo Strategy,"Computer Physics Communications 147 (2002): 64--70, cond-mat/0201313
- Herman Kahn, "Use of Different Monte Carlo Sampling Techniques", in H. A. Meyer (ed.), Symposion on the Monte Carlo Method [PDF preprint from 1955]
- Subhash R. Lele, Brian Dennis, Frithjof Lutscher, "Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods", Ecology Letters 10 (2007): 551--563 [PDF reprint via Prof. Lutscher. There is nothing specifically ecological about this technique]
- Willie Neiswanger, Chong Wang, Eric Xing, "Asymptotically Exact, Embarrassingly Parallel MCMC", arxiv:1311.4780
- Donald B. Rubin, "The Calculation of Posterior Distributions by Data Augmentation: Comment: A Noniterative Sampling/Importance Resampling Alternative to the Data Augmentation Algorithm for Creating a Few Imputations When Fractions of Missing Information Are Modest: The SIR Algorithm", Journal of the American Statistical Association 82 (1987): 543--546 [JSTOR]
- Yun Ju Sung, Charles J. Geyer, "Monte Carlo likelihood inference for missing data models", Annals of Statistics 35 (2007): 990--1011, arxiv:0708.2184
- Modesty forbids me to recommend:
- My lecture notes on Monte Carlo and Markov Chains and on Markov Chain Monte Carlo
- CRS, "Evaluating Posterior Distributions by Selectively Breeding Prior Samples", arxiv:2203.09077
- To read:
- Rosalind J. Allen, Patrick B. Warren and Pieter Rein ten Wolde, "Sampling Rare Switching Events in Biochemical Networks", q-bio.MN/0406006 = Physical Review Letters 94 (2005): 018104
- Christophe Andrieu, Arnaud Doucet and Roman Holenstein, "Particle Markov chain Monte Carlo methods", Journal of the Royal Statistical Society B 72 (2010): 269--342
- Christophe Andrieu, Éric Moulines, "On the ergodicity properties of some adaptive MCMC algorithms", Annals of Applied Probability 16 (2006): 1462--1505, math.PR/0610317
- Andriy Bandrivskyy, S. Beri, D. G. Luchinsky, R. Mannella, and P. V. E. McClintock, "Fast Monte Carlo simulations and singularities in the probability distributions of non-equilibrium systems," nlin.AO/0212038
- Bernd A. Berg
- "Generalized Ensemble Simulations for Complex Systems," cond-mat/0110521
- "Introduction to Markov Chain Monte Carlo Simulations and their Statistical Analysis", cond-mat/0410490
- M. J. Betancourt, "A General Metric for Riemannian Manifold Hamiltonian Monte Carlo", arxiv:1212.4693
- Anthony Brockwell, Pierre Del Moral, and Arnaud Doucet, "Sequentially interacting Markov chain Monte Carlo methods", Annals of Statistics 38 (2010): 3387--3411
- James A. Bucklew, "Conditional Importance Sampling Estimators", IEEE Transactions on Information Theory 51 (2005): 143--153
- James A. Bucklew, Sirin Nitinawarat and Jay Wierer, "Universal Simulation Distributions", IEEE Transactions on Information Theory 50 (2004): 2674--2685
- Simon Byrne, Mark Girolami, "Geodesic Monte Carlo on Embedded Manifolds", arxiv:1301.6064
- Fabien Campillo and Vivien Rossi, "Parallel and interacting Markov chains Monte Carlo method", math.PR/0610181
- Olivier Cappé, Randal Douc, Arnaud Guillin, Jean-Michel Marin, Christian P. Robert, "Adaptive Importance Sampling in General Mixture Classes", Statistics and Computing 18 (2008) 447--459, arxiv:0710.4242
- A. C. Carter, Alan J. Bray and M. A. Moore, "On the Use of Finite-Size Scaling to Measure Spin-Glass Exponents," cond-mat/0302207
- Fergal P. Casey, Joshua J. Waterfall, Ryan N. Gutenkunst, Christopher R. Myers, James P. Sethna, "Variational method for estimating the rate of convergence of Markov Chain Monte Carlo algorithms", physics/0609001
- Hock Peng Chan, Tze Leung Lai, "A sequential Monte Carlo approach to computing tail probabilities in stochastic models", Annals of Applied Probability 21 (2011): 2315--2342, arxiv:1202.4582
- Yuguo Chen, "Another look at rejection sampling through importance sampling", Statistics and Probability Letters 72 (2005): 277--283 ["We show that rejection sampling is inferior to the importance sampling algorithm in terms of the \chi^2 distance between the proposal distribution and the target distribution..."]
- Andrew M. Childs, Ryan B. Patterson and David J. C. MacKay, "Exact Sampling from Non-Attractive Distributions Using Summary States," cond-mat/0005132
- Nicolas Chopin, Pierre Jacob, "Free energy Sequential Monte Carlo, application to mixture modelling", arxiv:1006.3002
- C. I. Chou, Rongsheng Han, S. P. Li and T. K. Leem "Guided Simulated Annealing Method for Optimization Problems," cond-mat/0302137
- Francis Comets, Roberto Fernandez and Pablo A. Ferrari, "Processes with Long Memory: Regenerative Construction and Perfect Simulation," math.PR/0009204
- Thomas Dean, Paul Dupuis, "Splitting for Rare Event Simulation: A Large Deviation Approach to Design and Analysis", arxiv:0711.2037
- A. B. Deeker and M. Mandjes, "On asymptotically efficient simulation of large deviation probabilities", Advances in Applied Probability 37 (2005): 539--552
- Pierre Del Moral, Mean Field Simulation for Monte Carlo Integration
- Pierre Del Moral and Arnaud Doucet, "Interacting Markov chain Monte Carlo methods for solving nonlinear measure-valued equations", Annals of Applied Probability 20 (2010): 593--639
- Pierre Del Moral, Josselin Garnier, "Genealogical particle analysis of rare events", Annals of Applied Probability 15 (2005): 2496--2534, arxiv:math/0602525
- Paul Dupuis and Hui Wang, "Dynamic importance sampling for uniformly recurrent Markov chains", Annals of Applied Probability 15 (2005): 1--38, math.PR/0503454
- Tilman Enss, Malte Henkel, Alan Picone and Ulrich Schollwöck, "Ageing phenomena without detailed balance: the contact process", cond-mat/0406147 [Abstract: "The long-time dynamics of the 1D contact process suddenly brought out of an uncorrelated initial state is studied through a light-cone transfer-matrix renormalisation group approach. At criticality, the system undergoes ageing which is characterised through the dynamical scaling of the two-times autocorrelation and autoresponse functions. The observed non-equality of the ageing exponents a and b excludes the possibility of a finite fluctuation-dissipation ratio in the ageing regime. The scaling form of the critical autoresponse function is in agreement with the prediction of local scale-invariance." This approach is supposedly an alternative to some kinds of Monte Carlo]
- Daniel Egloff, "Monte Carlo Algorithms for Optimal Stopping and Statistical Learning", math.PR/0408276
- Jean-David Fermanian and Bernard Salanié "A Nonparametric Simulated Maximum Likelihood Estimation Method", Econometric Theory 20 (2004): 701--734
- James Allen Fill and Mark Huber, "The Randomness Recycler: A New Technique for Perfect Sampling," math.PR/0009242
- James M. Flegal, Murali Haran, Galin L. Jones, "Markov Chain Monte Carlo: Can We Trust the Third Significant Figure?", Statistical Science 23 (2008): 250--260, math.ST/0703746
- Cheng-Der Fuh, Huei-Wen Teng, Ren-Her Wang, "Efficient Importance Sampling for Rare Event Simulation with Applications", arxiv:1302.0583
- Axel Gandy and Patrick Rubin-Delanchy, "An algorithm to compute the power of Monte Carlo tests with guaranteed precision", Annals of Statistics 41 (2013): 125--142, arxiv:1110.1248
- Steven T. Garren and Richard L. Smith, "Estimating the second largest eigenvalue of a Markov transition matrix", Bernoulli 6 (2000): 215--242
- W. R. Gilks et al, Markov Chain Monte Carlo in Practice
- Mark Girolami, Ben Calderhead, Siu A. Chin, "Riemannian Manifold Hamiltonian Monte Carlo", arxiv:0907.1100
- Alexander K. Hartmann, "Sampling rare events: statistics of local sequence alginments," cond-mat/0108201
- Fred J. Hickernell, Lan Jiang, Yuewei Liu, Art Owen, "Guaranteed Conservative Fixed Width Confidence Intervals Via Monte Carlo Sampling", arxiv:1208.4318
- Mark L. Huber, "Approximation algorithms for the normalizing constant of Gibbs distributions", arxiv:1206.2689
- Mark Jerrum, "On the approximation of one Markov chain by another", Probability Theory and Related Fields 135 (2006): 1--14 [with special reference to MCMC]
- Michael Kastner, "Monte Carlo methods in statistical physics: Mathematical foundations and strategies", arxiv:0906.0858
- Ioannis Kontoyiannis and S. P. Meyn, "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability 18 (2008): 1491--1518, arxiv:math/0612040
- A. Lecchini-Visintini, J. Lygeros, J. Maciejowski, "Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains", 0709.2989
- Qiang Liu, Jian Peng, Alexander Ihler and John Fisher III, "Estimating the Partition Function by Discriminance Sampling", UAI 2015
- Ying Liu, Andrew Gelman, Tian Zheng, "Simulation-efficient shortest probability intervals", arxiv:1302.2142
- Neal Madras and Dana Randall, "Markov chain decomposition for convergence rate analysis", Annals of Applied Probability 12 (2002): 581--606
- Indrek Mandre, Jaan Kalda, "Efficient method of finding scaling exponents from finite-size Monte-Carlo simulations", European Physical Journal B 86 (2013): 56, arxiv:1303.0294
- J. D. Munoz, M. A. Novotny and S. J. Mitchell, "Rejection-free Monte Carlo algorithms for models with continuous degrees of freedom," Physical Review E 67 (2003): 026101
- K. P. N. Murthy, "Monte Carlo: Basics," cond-mat/0104215
- Radford M. Neal
- "Slice Sampling," physics/0009028
- "The Short-Cut Metropolis Method", math.ST/0508060
- Ronald C. Neath, "On Convergence Properties of the Monte Carlo EM Algorithm", arxiv:1206.4768
- M. A. Novotny, "A Tutorial on Advanced Dynamic Monte Carlo Methods for Systems with Discrete State Spaces," cond-mat/0109182
- Christian Robert and George Casella, Monte Carlo Statistical Methods
- Gareth O. Roberts and Jeffrey S. Rosenthal, "General state space Markov chains and MCMC algorithms", math.PR/0404033
- Sylvain Rubenthaler, Tobias Ryden and Magnus Wiktorsson, "Fast simulated annealing in $\R^d$ and an application to maximum likelihood estimation", math.PR/0609353
- R. Y. Rubinstein, "A Stochastic Minimum Cross-Entropy Method for Combinatorial Optimization and Rare-event Estimation", Methodology and Computing in Applied Probability 7 (2005): 5--50
- Tilman Sauer, "The Feynman Path Goes Monte Carlo," physics/0107010
- Pat Scott, "Pippi — painless parsing, post-processing and plotting of posterior and likelihood samples", European Physical Journal Plus 127 (2012): 138, arxiv:1206.2245
- Alexander Y. Shestopaloff, Radford M. Neal, "MCMC for non-linear state space models using ensembles of latent sequences", arxiv:1305.0320
- Gabriel Stoltz, "Path Sampling with Stochastic Dynamics: Some New Algorithms", cond-mat/0607650
- F. V. Tkachov, "Quasi-optimal observables: Attaining the quality of maximal likelihood in parameter estimation when only a MC event generator is available," arxiv:physics/0108030
- Dimiter Tsvetkov, Lyubomir Hristov, Ralitsa Angelova-Slavova, "On the convergence of the Metropolis-Hastings Markov chains", arxiv:1302.0654
- Andreas Wipf, Statistical Approach to Quantum Field Theory
- Bin Yu, "Density Estimation in the $L^{\infty}$ Norm for Dependent Data with Applications to the Gibbs Sampler", Annals of Statistics 21 (1993): 711--735