Exponential Families of Probability Measures

Last update: 21 Apr 2025 21:17
First version: 20 February 2011

I should explain what these are, but having done so elsewhere, I am feeling disinclined to do it again. (Later, I should just copy that text.)

I am particularly interested in exponential families for time series (very natural for Markov models) and for networks. More generally, if I have a family of stochastic processes (collections of dependent random variables) which form exponential families, what constraints does that put on the process?
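To make the "very natural for Markov models" remark concrete, here is a minimal sketch, not taken from any of the references below. It assumes a finite-state, time-homogeneous chain and conditions on the initial state. Under those assumptions the log-likelihood of a path depends on the data only through the matrix of transition counts, and is linear in those counts with coefficients log P[i][j]. The function names and the two-state transition matrix are made up for illustration. Note that the log P[i][j] are not free natural parameters, since each row of P must sum to one.

```python
import math
import random

def transition_counts(path, k):
    """Count i -> j transitions along the path (the sufficient statistics)."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    return counts

def loglik_stepwise(path, P):
    """Log-likelihood given the first state, summed one transition at a time."""
    return sum(math.log(P[a][b]) for a, b in zip(path, path[1:]))

def loglik_from_counts(counts, P):
    """The same quantity as an inner product of counts with log P."""
    k = len(P)
    return sum(
        counts[i][j] * math.log(P[i][j])
        for i in range(k)
        for j in range(k)
        if counts[i][j] > 0  # skip unused transitions, avoids log(0)
    )

def simulate(P, n_steps, start=0, seed=0):
    """Simulate a path of n_steps transitions from the chain with matrix P."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        nxt = rng.choices(range(len(P)), weights=P[path[-1]])[0]
        path.append(nxt)
    return path

# Hypothetical two-state chain, purely for illustration.
P = [[0.9, 0.1], [0.3, 0.7]]
path = simulate(P, 200)
counts = transition_counts(path, 2)

# The two ways of computing the log-likelihood agree up to rounding.
print(loglik_stepwise(path, P), loglik_from_counts(counts, P))
```

Any two paths with the same starting state and the same transition counts get the same likelihood under every P, which is the sense in which the counts are sufficient.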

Exponential families correspond to canonical ensembles in statistical mechanics. (Natural sufficient statistics : natural parameters :: extensive macroscopic variables : conjugate intensive variables.) In statistical mechanics, one of the justifications for using canonical ensembles for large systems comes from large deviations theory. Is there something equivalent in statistics proper? (Roussas's results on local asymptotic approximation of parametric models by exponential families feel like they should be connected here.)
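For reference, the correspondence can be written out in the standard textbook form. The notation here ($h$, $T$, $\theta$, $A$ for the statistical side; $E$, $\beta$, $Z$ for the simplest physical case, a single conserved energy) is my choice for this sketch and is not fixed anywhere else in these notes.

```latex
% Exponential family in natural (canonical) form:
p_\theta(x) = h(x) \exp\left( \langle \theta, T(x) \rangle - A(\theta) \right),
\qquad
A(\theta) = \log \int h(x) \, e^{\langle \theta, T(x) \rangle} \, dx .

% Canonical ensemble at inverse temperature \beta:
p_\beta(x) = \frac{e^{-\beta E(x)}}{Z(\beta)},
\qquad
Z(\beta) = \int e^{-\beta E(x)} \, dx .

% Dictionary: T(x) = E(x) (extensive), \theta = -\beta (intensive),
% A(\theta) = \log Z(\beta).

% A is the cumulant generating function of T, so
\nabla A(\theta) = \mathbb{E}_\theta[T(X)],
\qquad
\nabla^2 A(\theta) = \mathrm{Cov}_\theta[T(X)] .
```

With several extensive variables (energy, particle number, volume), $T$ becomes a vector and $\theta$ collects the corresponding conjugate intensive variables, up to sign and factors of $\beta$.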

Cumulants, and Cumulant Generating Functions
Information geometry
Large deviations
Maximum entropy
Statistical mechanics
Statistics in general
Sufficient statistics

Lawrence D. Brown, Fundamentals of Statistical Exponential Families: with Applications in Statistical Decision Theory [All you ever wanted to know, really. Now open access.]
Benoit Mandelbrot, "The Role of Sufficiency and of Estimation in Thermodynamics", Annals of Mathematical Statistics 33 (1962): 1021--1038 [Still one of the best discussions of the interplay between formal, statistical and substantive motivations for exponential families.]
Mark Schervish's Theory of Statistics [Exponential families are central enough to statistical theory that any good textbook will have decent coverage of the same key topics, but I found Mark's treatment particularly clear and streamlined before he became my department chair.]

Peter Guttorp, Stochastic Modeling of Scientific Data [Gives nice discussions and examples of using exponential families, and their properties, to model dependent data]
Andee Kaplan, Daniel Nordman, Stephen Vardeman, "On the instability and degeneracy of deep learning models", arxiv:1612.01159
Rudolf Kulhavy, Recursive Nonlinear Estimation: A Geometric Approach [Emphasizing information-geometric aspects]
Steffen L. Lauritzen
- Extremal Families and Systems of Sufficient Statistics
- "Extreme Point Models in Statistics", Scandinavian Journal of Statistics 11 (1984): 65--91 [Highlights of the book, without proofs but with decent typography. Includes some of his very interesting algebraic extensions to the usual notions of exponential families. JSTOR]
George G. Roussas, Contiguity of Probability Measures: Some Applications in Statistics [Asymptotic theory of exponential-family approximation, estimation and testing, for discrete-time Markov processes on fairly general state-spaces.]
Eric P. Xing, Michael I. Jordan, Stuart Russell, "A Generalized Mean Field Algorithm for Variational Inference in Exponential Families", UAI 2003, arxiv:1212.2512
Lin Yuan, Sergey Kirshner, Robert Givan, "Estimating Densities with Non-Parametric Exponential Families", arxiv:1206.5036

CRS and Alessandro Rinaldo, "Consistency under Sampling of Exponential Random Graph Models", Annals of Statistics 41 (2013): 508--535, arxiv:1111.3054 [Our results are actually about exponential families of stochastic processes in general, though inspired by and applied to puzzles arising from the ERGM situation]

O. E. Barndorff-Nielsen, Information and Exponential Families
Uwe Küchler and Michael Sørensen, Exponential Families of Stochastic Processes

Arvind Agarwal, Hal Daume III, "Generative Kernels for Exponential Families", AISTATS 2011
Karim Anaya-Izquierdo, Paul Marriott, "Local mixture models of exponential families", Bernoulli 13 (2007): 623--640, arxiv:0709.0447
Andrew R. Barron and Chyong-Hwa Sheu, "Approximation of Density Functions by Sequences of Exponential Families", Annals of Statistics 19 (1991): 1347--1369
Alexandre Belloni, Victor Chernozhukov, "Posterior Inference in Curved Exponential Families under Increasing Dimensions", arxiv:0904.3132
Imre Csiszar and Frantisek Matus, "Closures of exponential families", Annals of Probability 33 (2005): 582--600, math.PR/0503653
J. L. Denny, "Sufficient Conditions for a Family of Probabilities to be Exponential", Proceedings of the National Academy of Sciences 57 (1967): 1184--1187 [If we can find sufficient statistics which are averages over data points, and the data points are statistically independent, then the probability model must be an exponential family.]
Yonina C. Eldar, "Generalized SURE for Exponential Families: Applications to Regularization", arxiv:0804.3010
Mark L. Huber, "Approximation algorithms for the normalizing constant of Gibbs distributions", arxiv:1206.2689
Alexander Jung, Sebastian Schmutzhard, Franz Hlawatsch, "The RKHS Approach to Minimum Variance Estimation Revisited: Variance Bounds, Sufficient Statistics, and Exponential Families", arxiv:1210.6516
Sham Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari, "Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity", Journal of Machine Learning Research Proceedings 9 (2010): 381--388
Qiang Liu, Jian Peng, Alexander Ihler and John Fisher III, "Estimating the Partition Function by Discriminance Sampling", UAI 2015
Richard Lockhart and Federico O'Reilly, "A note on Moore's conjecture", Statistics and Probability Letters 74 (2005): 212--220 ["We establish the conjecture of Moore ... that the usual plug-in estimate of a distribution function and the Rao-Blackwell estimate of the distribution function are asymptotically equivalent for a wide class of exponential family distributions."]
Claudio Macci, Mauro Piccioni, "An inverse Sanov theorem for exponential families", arxiv:2111.14152
Frank Nielsen, "Chernoff information of exponential families", arxiv:1102.2684
Saisuke Okabayashi and Charles J. Geyer, "Long range search for maximum likelihood in exponential families", Electronic Journal of Statistics 6 (2012): 123--147
Johannes Rauh, "Optimally approximating exponential families", arxiv:1111.0483
Vincent Rivoirard, Judith Rousseau, "Posterior Concentration Rates for Infinite Dimensional Exponential Families", Bayesian Analysis 7 (2012): 311--334
Bharath Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Aapo Hyvärinen, Revant Kumar, "Density Estimation in Infinite Dimensional Exponential Families", Journal of Machine Learning Research 18:57 (2017): 1--59
Rolf Sundberg, Statistical Modelling by Exponential Families
Martin J. Wainwright and Michael I. Jordan, "Graphical Models, Exponential Families, and Variational Inference", Foundations and Trends in Machine Learning 1 (2008): 1--305 [PDF reprint via Prof. Jordan]
Eunho Yang, Pradeep K. Ravikumar, Genevera I. Allen, Zhandong Liu, "Conditional Random Fields via Univariate Exponential Families", NIPS 2013

CRS, "Exponential Families of Stochastic Automata and Their Mixtures"