Concentration of Measure

Last update: 07 Jul 2025 13:48
First version: 5 June 2010

Roughly speaking, one says that a probability measure is "concentrated" if any set of positive probability can be expanded very slightly to contain most of the probability. A classic example (which I learned in statistical mechanics) is the uniform distribution on the interior of unit spheres in high dimensions. A little bit of calculus shows that, as the number of dimensions grows, the ratio of the surface area to the volume does too, so a shell of thin, constant thickness on the interior of the sphere ends up containing almost all of its volume, and overlaps heavily with any set of positive probability. Equivalently, all well-behaved functional of a concentrated measure are very close to their expectation values with very high probability. (This is equivalent because one can specify a probability measure either through giving the probabilities of sets in the sigma-field, or by giving the expectation values of a sufficiently rich set of test functions, e.g., all bounded and continuous functions. [At least, bounded and continuous functions are enough for the weak topology on measures, but if you understand that qualification then you know what I'm saying anyway.])

In applications to probability, one is typically concerned with "concentration bounds", which are statements of the order of "for samples of size $ n $, the probability that any function $ f $ in a well-behaved class $ \mathcal{F} $ differs from its expectation value by $ \epsilon $ or more is at most $ C e^{-nr(\epsilon)} $", with explicit values of the prefactor $ C $ and the rate function $ r $; these might depend on the true distribution and on the function class $ \mathcal{F} $, but not on the specific function $ f $.

These finite-sample upper bounds are to be distinguished from asymptotic results giving matching upper and lower bounds (large deviations theory proper). They are also distinct from deviation inequalities which may depend on the specific function $ f $, though obviously finding such deviation inequalities is often a first step to proving concentration-of-measure results.

(Most of what I know about this subject I've learned from my sometime co-author, and once-upon-a-time student, Aryeh Kontorovich, but don't blame him for my mis-apprehensions.)

Empirical Process Theory Learning Theory
Model Selection
Statistical Mechanics

Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence [Mini-review]
Andreas Maurer, "Thermodynamics and Concentration", Bernoulli 18 (2012): 434--454, arxiv:1205.1595[Deriving concentration bounds from statistical-mechanical arguments; very nice.]
Maxim Raginsky, Igal Sason, Foundations and Trends in Communications and Information Theory 10 (2013): 1--246, "Concentration of Measure Inequalities in Information Theory, Communications and Coding", arxiv:1212.4663 [Mini-review]

Sergey Bobkov, Mokshay Madiman, "Concentration of the information in data with log-concave distributions", Annals of Probability 39 (2011): 1528--1543, arxiv:1012.5457
Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, "Concentration inequalities using the entropy method", Annals of Probability 31 (2003): 1583--1614
Sourav Chatterjee and Partha S. Dey, "Applications of Stein's method for concentration inequalities", Annals of Probability 38 (2010): 2443--2485, arxiv:0906.1034
J.-R. Chazottes, P. Collet, C. Kuelske, and F. Redig, "Concentration inequalities for random fields via coupling", Probability Theory and Related Fields 137 (2007): 201--225, math.PR/0503483
Leonid (Aryeh) Kontorovich [Comments]
- "Obtaining Measure Concentration from Markov Contraction", arxiv:0711.0987 [I really don't know why Leo bothered to take my class]
- "Measure Concentration of Hidden Markov Processes", math.PR/0608064
- "Measure concentration of Markov Tree Processes", math.PR/0608511
- "Concentration in unbounded metric spaces and algorithmic stability", pp. 2 --36 in ICML 2014, arxiv:1309.1007
Aryeh Kontorovich and Anthony Brockwell, "A Strong Law of Large Numbers for Strongly Mixing Processes", arxiv:0807.4665
Aryeh Kontorovich and Maxim Raginsky, "Concentration of measure without independence: a unified approach via the martingale method", arxiv:1602.00721
Leonid Kontorovich and Kavita Ramanan, "Concentration Inequalities for Dependent Random Variables via the Martingale Method", Annals of Probability 36 (2008): 2126--2158, arxiv:math/0609835
Pascal Massart, Concentration Inequalities and Model Selection [Using empirical process theory to get finite-sample, i.e., non-asymptotic, risk bounds for various forms of model selection. Available for free as a large PDF preprint.]
V. Maume-Deschamps, "Concentration Inequalities and Estimation of Conditional Probabilities" [PDF]
Huiming Zhang, Song Xi Chen, "Concentration Inequalities for Statistical Inference", arxiv:2011.02258 [A review of useful results]

Sylvain Arlot, Gilles Blanchard, and Etienne Roquain, "Some nonasymptotic results on resampling in high dimension, I: Confidence regions", Annals of Statistics 38 (2010): 51--82
Alexander Barovnik, lecture notes on Measure Concentration
Denis Belomestny, Vladimir Spokoiny, "Concentration inequalities for smooth random fields", arxiv:1307.1565
S. G. Bobkov and F. Götze, "Concentration of empirical distribution functions with applications to non-i.i.d. models", Bernoulli 16 (2010): 1385--1414, arxiv:1011.6165
Stéphane Boucheron and Maud Thomas
- "Concentration inequalities for order statistics", arxiv:1207.7209
- "Tail index estimation, concentration and adaptivity", arxiv:1503.05077
Pietro Caputo, Georg Menz, Prasad Tetali, "Approximate tensorization of entropy at high temperature", arxiv:1405.0608
Louis H. Y. Chen, Xiao Fang, "Multivariate Normal Approximation by Stein's Method: The Concentration Inequality Approach", arxiv:1111.4073
Pierre Del Moral, Peng Hu and Liming Wu, "On the Concentration Properties of Interacting Particle Processes", Foundations and Trends in Machine Learning 3 (2012): 225--389, arxiv:1107.1948
Evarist Giné, Vladimir Koltchinskii, Wenbo Li, Joel Zinn, "High Dimensional Probability", arxiv:math/0612726
Sébastien Gouëzel and Ian Melbourne, "Moment bounds and concentration inequalities for slowly mixing dynamical systems", arxiv:1404.0645
Nathael Gozlan, Christian Léonard, "Transport Inequalities. A Survey", arxiv:1003.3852
Arnaud Guillin, Aldéric Joulin, "Measure concentration through non-Lipschitz observables and functional inequalities", arxiv:1202.2341
Russell Impagliazzo, Valentine Kabanets, "Constructive Proofs of Concentration Bounds", Electronic Colloquium on Computational Complexity TR10-072 [From my superficial and biased inspection, the most impenetrably non-probabilistic proof of a probabilistic result I have ever seen. But potentially interesting.]
Can M. Le, Elizabeta Levina and Roman Vershynin, "Concentration and regularization of random graphs", arxiv:1506.00669
Jing Lei, "Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces", Bernoulli 26 (2020): 767--798
Ledoux, The Concentration of Measure Phenomenon
Johannes C. Lederer, Sara A. van de Geer, "New Concentration Inequalities for Suprema of Empirical Processes", arxiv:1111.3486
Jay Mardia, Jiantao Jiao, Ervin Tánczos, Robert D. Nowak, Tsachy Weissman, "Concentration Inequalities for the Empirical Distribution", arxiv:1809.06522
Cesar Maldonado, "Fluctuation Bounds for Chaos Plus Noise in Dynamical Systems", Journal of Statistical Physics 148 (2012): 548--564
Nikolai Matni, Stephen Tu, "A Tutorial on Concentration Bounds for System Identification", arxiv:1906.11395
Katalin Marton
- "Bounding $\bar{d}$-distance by informational divergence: a method to prove measure concentration", Annals of Probability 24 (1996): 857--866 [Thanks to Leo Kontorovich for the pointer]
- "Logarithmic Sobolev inequalities in discrete product spaces: a proof by a transportation cost distance", arxiv:1507.02803
Nikolai Matni, Stephen Tu, "A Tutorial on Concentration Bounds for System Identification", arxiv:1906.11395
Elizabeth Meckes, "Quantitative asymptotics of graphical projection pursuit", arxiv:0811.2769
Daniel Paulin, "Concentration inequalities for Markov chains by Marton couplings", arxiv:1212.2015
Iosif Pinelis, "Optimal re-centering bounds, with applications to Rosenthal-type concentration of measure inequalities", arxiv:1111.2622
Nathan Ross, "Fundamentals of Stein's method", arxiv:1109.1880
Timothy John Sullivan, Houman Owhadi, "Equivalence of concentration inequalities for linear and non-linear functions", arxiv:1009.4913
Talagrand, The Generic Chaining
Joel A. Tropp, "An Introduction to Matrix Concentration Inequalities", arxiv:1501.01571
Ramon van Handel, Probability in High Dimension [PDF lecture notes]
Sara van de Geer
- "Statistical Theory for High-Dimensional Models", arxiv:1409.8557 [I presume that most of this is now also in her book on high-dimensional statistics]
- "On the uniform convergence of empirical norms and inner products, with application to causal inference", arxiv:1310.5523
Roman Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science