## Large Deviations

*27 Sep 2015 22:41*

The limit theorems of probability theory ---
the weak and strong laws of large numbers, the central limit theorem, etc. ---
basically say that averages taken over large samples (of well-behaved
independent, identically distributed random variables) converge on expectation
values. (The strong law of large numbers asserts almost-sure convergence, the
central limit theorem asserts a kind of convergence in distribution, etc.)
These results say little or nothing about the *rate* of convergence,
however, which is often important for many applications of probability theory,
e.g., statistical mechanics. One way to address
this is the theory of large deviations. (I believe the terminology goes back
to Varadhan in the 1970s, but that's just an impression, rather than research.)

Let me say things sloppily first, so the idea comes through, and then more
precisely, so people who know the subject won't get too upset. Suppose $ X $
is a random variable with expected value $ \mathbf{E}[X] $, and we consider \(
S_n \equiv \frac{1}{n}\sum_{i=1}^{n}{X_i} \), the sample mean of $ n $ samples
of $ X $. $ S_n $ "obeys a large deviations principle" if there is a
non-negative function $ r $, called the rate function, such that
\[
\Pr{\left(\left| \mathbf{E}[X] - S_n \right| \geq \epsilon\right)}
\rightarrow e^{-nr(\epsilon)} ~.
\]

(The rate function has to obey some sensible but technical continuity
conditions.) This is a *large* deviation result, because the difference
between the empirical mean and the expectation is remaining constant as $ n $
grows --- there has to be a larger and larger conspiracy, as it were, among the
samples to keep deviating from the expectation in the same way. Now, one
reason what I've stated isn't really enough to satisfy a mathematician is that
the right-hand side converges on zero, so the functional form of the
probability could be anything which also converges on zero and that'd be
satisfied, but we want to pick out *exponential* convergence. The usual
way is to look at the limiting growth rate of the probability. Also, we want
the probability that the difference between the empirical mean and the
expectation falls into any arbitrary set. So one usually sees the LDP asserted
in some form like, for any reasonable set $ A $,
\[
\lim_{n\rightarrow\infty}{-\frac{1}{n}\log{\Pr\left(\left|
\mathbf{E}[X] - S_n \right| \in A\right)}} = \inf_{x\in A}{r(x)} ~.
\]

(Actually, to be *completely* honest, I really shouldn't be assuming
that there is a limit to those probabilities. Instead I should say that the lim
sup of that expression is at most the infimum of the rate function over the
interior of $ A $, and the lim inf is at least the infimum of the rate function
over the closure of $ A $.)
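
Written with the more common sign convention, the two-sided statement looks like the following. This is a sketch of the standard textbook form rather than anything specific to these notes; $ D_n $ is shorthand introduced only for this display for $ \left| \mathbf{E}[X] - S_n \right| $, and $ A^{\circ} $ and $ \bar{A} $ denote the interior and closure of $ A $.
\[
-\inf_{x \in A^{\circ}}{r(x)}
\leq \liminf_{n\rightarrow\infty}{\frac{1}{n}\log{\Pr\left(D_n \in A\right)}}
\leq \limsup_{n\rightarrow\infty}{\frac{1}{n}\log{\Pr\left(D_n \in A\right)}}
\leq -\inf_{x \in \bar{A}}{r(x)} ~.
\]

When the infima over the interior and the closure coincide, the limit exists and this reduces to the single-limit form displayed earlier.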

Similar large deviation principles can be stated for the empirical distribution, the empirical process, functionals of sample paths, etc., rather than just the empirical mean. There are tricks for relating LDPs on higher-level objects, like the empirical distribution over trajectories, to LDPs on lower-level objects, like empirical means. (These go under names like "the contraction principle".)
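
To make the contraction principle slightly more concrete, here is a sketch of its usual textbook form. The symbols $ Y_n $, $ I $, $ J $ and $ f $ are notation introduced just for this illustration, and the regularity conditions are glossed over. If the higher-level object $ Y_n $ obeys an LDP with rate function $ I $, and $ f $ is a continuous map, then $ f(Y_n) $ obeys an LDP with rate function
\[
J(z) = \inf_{y \, : \, f(y) = z}{I(y)} ~.
\]

In words, the least unlikely way for the lower-level object to take the value $ z $ is the least unlikely way for the higher-level object to land anywhere that maps to $ z $. Taking $ Y_n $ to be the empirical distribution and $ f $ the mean recovers a rate function for the sample mean, at least for bounded variables, where the mean is a continuous functional of the distribution.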

Since ergodic theory extends the probabilistic limit laws to stochastic processes, rather than just sequences of independent variables, it shouldn't be surprising that large deviation principles also hold for some stochastic processes. I am particularly interested in LDPs for Markov processes, and their applications. There are further important connections to information theory, since in an awful lot of situations, the large deviations rate function is the Kullback-Leibler divergence, a.k.a. the relative entropy.
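
As a rough numerical illustration of the link to relative entropy, the sketch below looks at the simplest IID case: Bernoulli($p$) coin flips, with the one-sided event that the sample mean is at least some $ a > p $. The function names, the choice $ p = 0.5 $, $ a = 0.7 $, and the restriction to a one-sided event are all assumptions made for this example. It computes the exact binomial tail probability in log space and compares $ -\frac{1}{n}\log{\Pr} $ with the Kullback-Leibler divergence between Bernoulli($a$) and Bernoulli($p$).

```python
import math


def kl_bernoulli(q, p):
    """KL divergence D(Bernoulli(q) || Bernoulli(p)) in nats, for 0 < p < 1."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total


def log_upper_tail(n, p, a):
    """Exact log Pr(S_n >= a), where S_n is the mean of n IID Bernoulli(p) draws."""
    # Smallest success count whose sample mean reaches a (small tolerance
    # guards against floating-point error in n * a).
    k_min = math.ceil(n * a - 1e-9)
    # Log of each binomial probability, to avoid underflow for large n.
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    # Log-sum-exp over the tail.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def empirical_rate(n, p, a):
    """Finite-n decay rate -(1/n) log Pr(S_n >= a)."""
    return -log_upper_tail(n, p, a) / n


if __name__ == "__main__":
    p, a = 0.5, 0.7
    print("KL divergence:", kl_bernoulli(a, p))
    for n in (10, 100, 1000, 10000):
        print(n, empirical_rate(n, p, a))
```

The printed finite-$n$ rates should approach the KL value from above as $ n $ grows: the Chernoff bound keeps them above it at every $ n $, and the large deviations principle says the gap vanishes in the limit.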

Related, but strictly speaking distinct topics:

- Finite-sample deviation inequalities, such as the Bernstein, Chernoff and Hoeffding inequalities, which bound the probability of averages departing by more than a certain amount from expectation values at given finite sample sizes;
- Concentration of measure: roughly speaking, upper bounds on deviation probabilities holding uniformly over large classes of functions. (Note that large deviations principles have *matching* upper and lower bounds, and need only hold asymptotically.)
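
To illustrate the contrast with finite-sample inequalities, here is a minimal sketch comparing Hoeffding's one-sided bound, $ e^{-2n\epsilon^2} $ for variables confined to $ [0,1] $, with the exact tail probability for Bernoulli means. The function names and the particular values of $ p $, $ \epsilon $ and $ n $ are assumptions chosen for the example.

```python
import math


def exact_upper_tail(n, p, eps):
    """Exact Pr(S_n - p >= eps) for the mean S_n of n IID Bernoulli(p) draws."""
    # Smallest success count whose sample mean exceeds p by at least eps.
    k_min = math.ceil(n * (p + eps) - 1e-9)
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


def hoeffding_bound(n, eps):
    """Hoeffding's one-sided bound for means of [0, 1]-valued variables."""
    return math.exp(-2 * n * eps**2)


if __name__ == "__main__":
    p, eps = 0.5, 0.2
    for n in (10, 50, 200):
        print(n, exact_upper_tail(n, p, eps), hoeffding_bound(n, eps))
```

The bound holds at every sample size, which is the point of such inequalities, but its exponent $ 2\epsilon^2 $ is in general smaller than the true asymptotic decay rate, so it becomes increasingly loose as $ n $ grows.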

See also: Exponential Families of Probability Measures; Maximum entropy

- Recommended:
- James Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation
- Thomas Cover and Joy Thomas, Elements of Information Theory [Very nice chapter on large deviations for IID sequences]
- Amir Dembo and Ofer Zeitouni, Large Deviations Techniques and Applications [Chapters 2, 4 and 5, and parts of chapter 6, are available in postscript format via Prof. Dembo's page for his course on large deviations]
- Frank den Hollander, Large Deviations [Nice introductory text for people with an applied probability background. Short.]
- Richard S. Ellis
  - "The Theory of Large Deviations: from Boltzmann's 1877 Calculation to Equilibrium Macrostates in 2D Turbulence", Physica D **133** (1999): 106--136
  - Entropy, Large Deviations, and Statistical Mechanics
- M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems
- Hugo Touchette, "The Large Deviations Approach to Statistical Mechanics", Physics Reports **478** (2009): 1--69, arxiv:0804.0327
- S. R. S. Varadhan, "Large Deviations", Annals of Probability **36** (2008): 397--419 [Copy via Prof. Varadhan. Wald Lecture for 2005.]

- Recommended, more specialized:
- R. R. Bahadur, Some Limit Theorems in Statistics [1971. The notation is now much more transparent, and the proofs of many basic theorems considerably simplified. But if there's a better source for statistical applications than this little book, I've yet to find it.]
- Julien Barré, Freddy Bouchet, Thierry Dauxois and Stefano Ruffo, "Large deviation techniques applied to systems with long-range interactions", cond-mat/0406358 = Journal of Statistical Physics **119** (2005): 677--713
- Michel Benaïm and Jörgen W. Weibull, "Deterministic Approximation of Stochastic Evolution in Games", Econometrica **71** (2003): 879--903 [JSTOR]
- Daniel Berend, Peter Harremoës, Aryeh Kontorovich, "Minimum KL-divergence on complements of L1 balls", arxiv:1206.6544
- Christian Borgs, Jennifer Chayes and David Gamarnik, "Convergent sequences of sparse graphs: A large deviations approach", arxiv:1302.4615 [See under graph limits]
- Arijit Chakrabarty, "Effect of truncation on large deviations for heavy-tailed random vectors", arxiv:1107.2476
- Sourav Chatterjee and S. R. S. Varadhan, "The large deviation principle for the Erdos-Renyi random graph", arxiv:1008.1946
- J.-R. Chazottes and D. Gabrielli, "Large deviations for empirical entropies of Gibbsian sources", math.PR/0406083 = Nonlinearity **18** (2005): 2545--2563 [This is a very cool result which shows that block entropies, and entropy rates estimated from those blocks, obey the large deviation principle even as one lets the length of the blocks grow with the amount of data, provided the block-length doesn't grow too quickly (only logarithmically). I wish I could write papers like this.]
- W. De Roeck, Christian Maes and Karel Netocny, "H-Theorems from Autonomous Equations", cond-mat/0508089 [This basically derives the H-theorem of statistical mechanics as a large deviations result, assuming a certain reasonable Markovian form for the macroscopic dynamics. In fact, we have a separate argument that if you *don't* have that Markovian form, you're just not trying hard enough; see here]
- Paul Dupuis, "Large Deviations Analysis of Some Recursive Algorithms with State-Dependent Noise", Annals of Probability **16** (1988): 1509--1536 [Open access]
- Gregory L. Eyink
  - "Action principle in nonequilibrium statistical dynamics," Physical Review E **54** (1996): 3419--3435 [Least action as a consequence of Markovian LDP]
  - "A Variational Formulation of Optimal Nonlinear Estimation," physics/0011049 [Nice connections between optimal state estimation (assuming a known form for the underlying stochastic process), nonequilibrium statistical mechanics, and large deviations theory, leading to tractable-looking numerical schemes for estimation.]
- Jin Feng and Thomas G. Kurtz, Large Deviations for Stochastic Processes [Online]
- James C. Fu, "Large Sample Point Estimation: A Large Deviation Theory Approach", Annals of Statistics **10** (1982): 762--771
- Fuqing Gao and Xingqiu Zhao, "Delta method in large deviations and moderate deviations for estimators", Annals of Statistics **39** (2011): 1211--1240, arxiv:1105.3552 [This is based on an extension of the "contraction principle" which is of independent interest]
- Alexander Korostelev, "A minimaxity criterion in nonparametric regression based on large-deviations probabilities", Annals of Statistics **24** (1996): 1075--1083
- Steven Orey and Stephan Pelikan, "Large deviations principles for stationary processes", Annals of Probability **16** (1988): 1481--1495
- Eric Smith, "Large-deviation principles, stochastic effective actions, path entropies, and the structure and meaning of thermodynamic descriptions", arxiv:1102.3938
- Eric Smith and Supriya Krishnamurthy, "Symmetry and Collective Fluctuations in Evolutionary Games", SFI Working Paper 11-03-010

- To read:
- Paul H. Algoet and Brian H. Marcus, "Large Deviation Theorems for Empirical Types of Markov Chains Constrained to Thin Sets," IEEE Trans. Info. Theory **38** (1992): 1276--1291
- F. Altarelli, A. Braunstein, L. Dall'Asta, and R. Zecchina, "Large deviations of cascade processes on graphs", Physical Review E **87** (2013): 062115, arxiv:1305.5745
- Alexei Andreanov, Giulio Biroli, Jean-Philippe Bouchaud, and Alexandre Lefèvre, "Field theories and exact stochastic equations for interacting particle systems", Physical Review E **74** (2006): 030101 = cond-mat/0602307
- David Andrieux, "Equivalence classes for large deviations", arxiv:1208.5699
- Ellen Baake, Frank den Hollander and Natali Zint, "How T-Cells Use Large Deviations to Recognize Foreign Antigens", arxiv:q-bio.SC/0605016 [Presumably == the paper of the same title in Journal of Mathematical Biology **57** (2008): 841--861, but that orders the authors Zint, Baake and den Hollander.]
- J. Barral and P. Goncalves, "On the Estimation of the Large Deviations Spectrum", Journal of Statistical Physics **144** (2011): 1256--1283
- L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, C. Landim, "Large deviation approach to non equilibrium processes in stochastic lattice gases", arxiv:math/0602557
- Matthias Birkner, Andreas Greven and Frank den Hollander, "Quenched large deviation principle for words in a letter sequence", arxiv:0807.2611
- Igor Bjelakovic, Jean-Dominique Deuschel, Tyll Krueger, Ruedi Seiler, Rainer Siegmund-Schultze and Arleta Szkola
  - "A quantum version of Sanov's theorem", quant-ph/0412157 = Communications in Mathematical Physics **260** (2005): 659--671 [Quantum large deviations!]
  - "Typical support and Sanov large deviations of correlated states", math.PR/0703772 = Communications in Mathematical Physics **279** (2008): 559--584
- Amarjit Budhiraja, Paul Dupuis, Markus Fischer, "Large deviation properties of weakly interacting processes via weak convergence methods", arxiv:1009.6030
- Amarjit Budhiraja, Paul Dupuis, Vasileios Maroulas, "Large deviations for infinite dimensional stochastic dynamical systems", Annals of Probability **36** (2008): 1390--1420 = arxiv:0808.3631
- Raphaël Cerf and Pierre Petit, "Cramér's theorem for asymptotically decoupled fields", arxiv:1103.4415 [The English abstract is extremely interesting, but unfortunately this paper is in French, so my marking it "to read" is misleading.]
- Arijit Chakrabarty, "Central Limit Theorem and Large Deviations for truncated heavy-tailed random vectors", arxiv:1003.2159
- Po-Ning Chen, "Generalization of Gartner-Ellis theorem", IEEE Transactions on Information Theory **46** (2000): 2752--2760
- Zhiyi Chi
  - "Large deviations for template matching between point processes", Annals of Applied Probability **15** (2005): 153--174 = math.PR/0503463
  - "On the asymptotic of likelihood ratios for self-normalized large deviations", arxiv:0709.1506
- Igor Chueshov and Annie Millet, "Stochastic 2D hydrodynamical type systems: Well posedness and large deviations", arxiv:0807.1810
- A. de Acosta, "A general nonconvex large deviation result II", Annals of Probability **32** (2004): 1873--1901 = math.PR/0410101
- Zach Deitz and Sunder Sethuraman, "Large deviations for a class of nonhomogeneous Markov chains", math.PR/0404230
- Frank den Hollander, Julien Poisat, "Large deviation principles for words drawn from correlated letter sequences", arxiv:1303.5383
- B. Derrida, "Non equilibrium steady states: fluctuations and large deviations of the density and of the current", cond-mat/0703762
- B. Derrida, Joel L. Lebowitz and Eugene R. Speer, "Exact Large Deviation Functional for the Density Profile in a Stationary Nonequilibrium Open System," cond-mat/0105110
- Manh Hong Duong, Mark A. Peletier, Upanshu Sharma, "Coarse-graining and fluctuations: Two birds with one stone", arxiv:1404.1466
- Paul Dupuis and Richard S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations [PDF preprint]
- Vlad Elgart and Alex Kamenev, "Rare Events Statistics in Reaction--Diffusion Systems", cond-mat/0404241 [i.e., large deviations]
- Andreas Engel, Remi Monasson and Alexander K. Hartmann, "On Large Deviation Properties of Erdos-Renyi Random Graphs", Journal of Statistical Physics **117** (2004): 387--426
- Mikhail Ermakov, "A moderate deviation principle for empirical bootstrap measure", arxiv:1206.1459
- Parisa Fatheddin, Jie Xiong, "Large Deviation Principle for Some Measure-Valued Processes", arxiv:1204.3501
- Hans Follmer and Steven Orey, "Large Deviations for the Empirical Field of a Gibbs Measure", Annals of Probability **16** (1988): 961--977
- Jorge Garcia, "A Large Deviation Principle for Stochastic Integrals", Journal of Theoretical Probability **21** (2008): 476--501
- Cristian Giardinà, Jorge Kurchan, Luca Peliti, "Direct evaluation of large-deviation functions", cond-mat/0511248 ["numerical [evaluation of] probabilities of large deviations of physical quantities, such as current or density, that are local in time. The large-deviation functions are given in terms of the typical properties of a modified dynamics, and since they no longer involve rare events, can be evaluated efficiently and over a wider ranges of values."]
- Yuri Golubev, Vladimir Spokoiny, "Exponential bounds for minimum contrast estimators", arxiv:0901.0655
- Nathael Gozlan and Christian Léonard
  - "A large deviation approach to some transportation cost inequalities", math.PR/0510601
  - "Transport inequalities. A survey", arxiv:1003.3852

- Alice Guionnet, "Large deviations and stochastic calculus for large random matrices", Probability Surveys **1** (2004): 72--172 [Open access]
- O. V. Gulinskii and R. S. Liptser, "Example of Large Deviations for Stationary Processes", Theory of Probability and Applications **44** (1999): 211--225 [PDF]
- Te Sun Han
  - "Hypothesis Testing with the General Source", IEEE Transactions on Information Theory **46** (2000): 2415--2427 = math.PR/0004121 ["The asymptotically optimal hypothesis testing problem with the general sources as the null and alternative hypotheses is studied.... Our fundamental philosophy in doing so is first to convert all of the hypothesis testing problems completely to the pertinent computation problems in the large deviation-probability theory. ... [This] enables us to establish quite compact general formulas of the optimal exponents of the second kind of error and correct testing probabilities for the general sources including all nonstationary and/or nonergodic sources with arbitrary abstract alphabet (countable or uncountable). Such general formulas are presented from the information-spectrum point of view."]
  - "An information-spectrum approach to large deviation theorems", cs.IT/0606104
- Zhishui Hu, John Robinson, Qiying Wang, "Cramér-type large deviations for samples from a finite population", Annals of Statistics **35** (2007): 673--696, arxiv:0708.1880
- Dayu Huang, Sean Meyn, "Generalized Error Exponents For Small Sample Universal Hypothesis Testing", IEEE Transactions on Information Theory **59** (2013): 8157--8181, arxiv:1204.1563
- Henrik Hult and Gennady Samorodnitsky, "Large deviations for point processes based on stationary sequences with heavy tails", Journal of Applied Probability **47** (2010): 1--40
- Svante Janson, "Large deviations for sums of partly dependent random variables", Random Structures and Algorithms **24** (2004): 234--248 ["We use and extend a method by Hoeffding to obtain strong large deviation bounds for sums of dependent random variables with suitable dependency structure. The method is based on breaking up the sum into sums of independent variables. Applications are given to U-statistics, random strings and random graphs." Applied here only to Erdos-Renyi (IID) random graphs, but might be extendable to Markov random graphs...? PDF preprint]
- Giovanni Jona-Lasinio, "From fluctuations in hydrodynamics to nonequilibrium thermodynamics", arxiv:1003.4164
- Vladislav Kargin, "A Large Deviation Inequality for Vector Functions on Finite Reversible Markov Chains", math.PR/0508538
- Gerhard Keller, Equilibrium States in Ergodic Theory
- Michael Keyl, "Quantum state estimation and large deviations", quant-ph/0412053
- Yuri Kifer, "Large deviations and adiabatic transitions for dynamical systems and Markov processes in fully coupled averaging", arxiv:0710.2405
- Yuri Kifer, S. R. S. Varadhan, "Nonconventional Large Deviations Theorems", Probability Theory and Related Fields **158** (2014): 197--224, arxiv:1206.0156
- Yuichi Kitamura, "Empirical likelihood methods in econometrics: Theory and Practice", Cowles Foundation Discussion Paper No. 1569 (2006)
- F. Klebaner and R. Liptser, "Large Deviations for Past-Dependent Recursions", math.PR/0603407 [Corrected version of Problems of Information Transmission **32** (1996): 23--34]
- Ioannis Kontoyiannis and S. P. Meyn
  - "Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes", math.PR/0509310 = Electronic Journal of Probability **10** (2005): 61--123
  - "Spectral Theory and Limit Theorems for Geometrically Ergodic Markov Processes", math.PR/0209200 = Annals of Applied Probability **13** (2003): 304--362
  - "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability **18** (2008): 1491--1518, arxiv:math/0612040
- D. Lacoste, A. W. C. Lau and K. Mallick, "Fluctuation theorem and large deviation function for a solvable model of a molecular motor", Physical Review E **78** (2008): 011915
- Vivien Lecomte, Cécile Appert-Rolland, and Frédéric van Wijland
  - "Thermodynamic formalism for systems with Markov dynamics", cond-mat/0606211
  - "Thermodynamic formalism and large deviation functions in continuous time Markov dynamics", cond-mat/0703435
- Vivien Lecomte and Julien Tailleur, "A numerical approach to large deviations in continuous time", Journal of Statistical Mechanics: Theory and Experiment **2007**: P03004
- Raphael Lefevere, Mauro Mariani, Lorenzo Zambotti, "Large deviations for renewal processes", arxiv:1009.2659
- Christian Léonard, "Entropic Projections and Dominating Points", ESAIM: Probability and Statistics **14** (2010): 343--381, arxiv:0711.0206 ["Generalized entropic projections and dominating points are solutions to convex minimization problems related to conditional laws of large numbers"]
- Robert Sh. Liptser and Anatolii A. Pukhalskii, "Limit theorems on large deviations for semimartingales", math.PR/0510028 [But published in a journal in 1992]
- Fotis Loukissas, "Precise Large Deviations for Long-Tailed Distributions", Journal of Theoretical Probability **25** (2012): 913--924
- Yutao Ma, Ran Wang, Liming Wu, "Moderate Deviation Principle for dynamical systems with small random perturbation", arxiv:1107.3432
- Claudio Macci, "Large Deviations for Empirical Estimators of the Stationary Distribution of a Semi-Markov Process with Finite State Space", Communications in Statistics: Theory and Methods **37** (2008): 3077--3089
- Satya N. Majumdar and Alan J. Bray, "Large-Deviation Functions for Nonlinear Functionals of a Gaussian Stationary Markov Process", cond-mat/0202138 = Physical Review E **65** (2002): 051112
- Mario Filiasi, Giacomo Livan, Matteo Marsili, Maria Peressi, Erik Vesselli, Elia Zarinelli, "On the concentration of large deviations for fat tailed distributions, with application to financial data", arxiv:1201.2817
- David McAllester, "A Statistical Mechanics Approach to Large Deviations Theorems" [E-print available via CiteSeer --- published?]
- Thomas Mikosch, Olivier Wintenberger, "Precise large deviations for dependent regularly varying sequences", arxiv:1206.1395
- Abdelkader Mokkadem, Mariane Pelletier and Baba Thiam, "Large and moderate deviations principles for kernel estimators of the multivariate regression", math.ST/0703341
- K. Netocny and F. Redig, "Large deviations for quantum spin systems", math-ph/0404018 = Journal of Statistical Physics **117** (2004): 521--547
- Enzo Olivieri and Maria Eulalia Vares, Large Deviations and Metastability
- Magda Peligrad, Hailin Sang, Yunda Zhong, Wei Biao Wu, "Exact Moderate and Large Deviations for Linear Processes", arxiv:1111.0537
- Huyen Pham, "Some applications and methods of large deviations in finance and insurance", math.PR/0702473
- Mark Pollicott and Richard Sharp, "Large Deviations, Fluctuations and Shrinking Intervals", Communications in Mathematical Physics **290** (2009): 321--334
- Anatoly Puhalskii, Large Deviations and Idempotent Probability
- Anatolii A. Puhalskii, "Stochastic processes in random graphs", math.PR/0402183 [Large deviations for Erdos-Renyi graphs. Memo to self: how much work would it be to extend this to Markovian graphs?]
- Hong Qian, "Relative Entropy: Free Energy Associated with Equilibrium Fluctuations and Nonequilibrium Deviations", math-ph/0007010 = Physical Review E **63** (2001): 042103
- Olivier Rivoire, "The cavity method for large deviations", cond-mat/0506164 = Journal of Statistical Mechanics: Theory and Experiment (2005): P07004 ["A method is introduced for studying large deviations in the context of statistical physics of disordered systems. The approach, based on an extension of the cavity method to atypical realizations of the quenched disorder, allows us to compute exponentially small probabilities (rate functions) over different classes of random graphs."]
- David Ruelle, Thermodynamic Formalism
- Shin-ichi Sasa, "Physics of Large Deviation", arxiv:1204.5584
- L. Saulis and V. A. Statulevicius, Limit Theorems for Large Deviations
- Carolyn Schroeder, "I-Projection and Conditional Limit Theorems for Discrete Parameter Markov Processes", Annals of Probability **21** (1993): 721--758
- Adam Shwartz, Large Deviations in Performance Modeling
- Joe Suzuki, "A Markov chain analysis of genetic algorithms: large deviation principle approach", Journal of Applied Probability **47** (2010): 967--975
- Vincent Y. F. Tan, Animashree Anandkumar, Lang Tong and Alan S. Willsky, "A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures", IEEE Transactions on Information Theory **57** (2011): 1714--1735, arxiv:0905.0940 [Large deviations for Chow-Liu trees]
- Hugo Touchette, Rosemary J. Harris, "Large deviation approach to nonequilibrium systems", arxiv:1110.5216
- José Trashorras, Olivier Wintenberger, "Large deviations for bootstrapped empirical measures", arxiv:1110.4620
- A. Vulpiani, F. Cecconi, M. Cencini, A. Puglisi and D. Vergni (eds.), Large Deviations in Physics: The Legacy of the Law of Large Numbers
- Wei Wang, A. J. Roberts and Jinqiao Duan, "Large deviations for slow-fast stochastic partial differential equations", arxiv:1001.4826
- Lingjiong Zhu, "Process-Level Large Deviations for General Hawkes Processes", arxiv:1108.2431

- To write:
- CRS, "Large Deviations in Exponential Families of Stochastic Automata"

*Previous versions*: 2005-11-09 17:39 (but not the first version by any means)