Characterizing Mixtures of Processes by Summarizing Statistics
27 Aug 2019 12:39
A statistic is sometimes said to summarize a probability measure if the probability of a sample is a function only of the value of the statistic. A statistic which summarizes all the distributions in some family is a sufficient statistic for that family (by the Neyman factorization criterion). The empirical measure summarizes any stochastic process which consists of independent and identically distributed observations. Conversely, any (infinite) stochastic process summarized by the empirical measure is exchangeable, and so, by de Finetti's theorem, a mixture of IID processes. Any homogeneous Markov process, similarly, is summarized by the empirical measure of successive pairs of observations, plus the first observation, and any recurrent process summarized by the empirical pair measure, together with the first observation, is a mixture of Markov processes (thus Diaconis and Freedman). Similarly, any process summarized by the empirical measure of \( k \)-tuples is a mixture of Markov processes of order \( k-1 \).
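To make "summarizing" concrete, here is a minimal sketch for finite alphabets. All function names (`symbol_counts`, `pair_counts`, `iid_log_prob`, `markov_log_prob`) are hypothetical, introduced only for illustration. The sketch checks that two samples with the same symbol counts get the same IID probability, and that two samples with the same first observation and the same pair counts get the same Markov probability.

```python
from collections import Counter
from math import log, isclose

def symbol_counts(xs):
    """Symbol counts; dividing by len(xs) gives the empirical measure."""
    return Counter(xs)

def pair_counts(xs):
    """Counts of successive pairs (x_t, x_{t+1}); dividing by len(xs) - 1
    gives the empirical pair measure."""
    return Counter(zip(xs, xs[1:]))

def iid_log_prob(xs, p):
    """Log-probability of xs under IID draws from p.
    Uses xs only through its symbol counts."""
    return sum(n * log(p[x]) for x, n in symbol_counts(xs).items())

def markov_log_prob(xs, init, trans):
    """Log-probability of xs under a homogeneous first-order Markov chain.
    Uses xs only through its first observation and its pair counts."""
    return log(init[xs[0]]) + sum(
        n * log(trans[a][b]) for (a, b), n in pair_counts(xs).items()
    )

# Illustrative (assumed) parameters on the alphabet {a, b}.
p = {"a": 0.6, "b": 0.4}
init = {"a": 0.5, "b": 0.5}
trans = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}

xs, ys, zs = "aabab", "abaab", "babaa"

# All three share the same symbol counts, so the IID probabilities agree.
same_iid = isclose(iid_log_prob(xs, p), iid_log_prob(zs, p))

# xs and ys share the first symbol and the pair counts; zs does not.
same_markov = isclose(markov_log_prob(xs, init, trans),
                      markov_log_prob(ys, init, trans))
print(same_iid, same_markov, pair_counts(xs) == pair_counts(zs))
```

The point of writing the probabilities in terms of counts is that the factorization is visible in the code: the sample enters only through the statistic, which is what the sufficiency claim asserts.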
In large deviations theory, we use the summarizing properties to prove exponential convergence of empirical distributions to true distributions: for IID processes we start with the convergence of the empirical measure, but for Markov processes we start with the empirical pair measure (and get the convergence of the one-dimensional distribution by integration). We can also extend these to higher-dimensional distributions, in the (projective) limit looking at the empirical path measure (sometimes, in a namespace collision, the "empirical process"), which we get by taking infinite cyclic repetitions of the observed sample path and all its shifts. (That is, having observed the sequence \( x_1, x_2, x_3, \ldots x_{n-1}, x_n \), the empirical path measure puts equal probability on \( x_1, x_2, x_3, \ldots x_{n-1}, x_n \), repeated forever; on \( x_2, x_3, \ldots x_{n-1}, x_n, x_1 \), repeated forever; and so on to repeating \( x_n, x_1, x_2, x_3, \ldots x_{n-1} \) forever.)
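The construction of the empirical path measure can be sketched as follows, under the assumption of a finite observed sequence and representing each infinitely repeated path by its period. The names `cyclic_shifts` and `path_measure_marginal` are hypothetical. The sketch computes finite-dimensional marginals of the path measure and lets one check that they do not depend on where the window starts, which is to say the measure is shift-invariant.

```python
from collections import Counter
from fractions import Fraction

def cyclic_shifts(xs):
    """All n cyclic shifts of the observed sequence, each standing for
    the infinite path obtained by repeating it forever."""
    xs = tuple(xs)
    return [xs[i:] + xs[:i] for i in range(len(xs))]

def path_measure_marginal(xs, k, start=0):
    """Distribution of coordinates start, ..., start + k - 1 under the
    empirical path measure, which gives weight 1/n to each cyclic shift."""
    n = len(xs)
    weight = Fraction(1, n)
    marginal = Counter()
    for shift in cyclic_shifts(xs):
        # Indices wrap around because each path is periodic with period n.
        block = tuple(shift[(start + j) % n] for j in range(k))
        marginal[block] += weight
    return dict(marginal)

observed = "abb"
one_dim = path_measure_marginal(observed, 1)        # ordinary empirical measure
two_dim = path_measure_marginal(observed, 2)        # cyclic pair measure
shifted = path_measure_marginal(observed, 2, start=1)
print(one_dim, two_dim, two_dim == shifted)
```

Note that the two-dimensional marginal here differs from the empirical pair measure of the previous paragraph by including the wrap-around pair \( (x_n, x_1) \); the discrepancy is of order \( 1/n \).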
Query: For what class of processes is the empirical path measure summarizing? Query: Is this class also a convex set? If so, what are the extremal points? Conjecture: The class is the class of stationary processes, in which case the extremal points are the stationary and ergodic measures.
If you would like to collaborate on this, get in touch. If you would like to work on it on your own, please be so good as to give me credit. If you know where this problem is already solved, please let me know.
See also: Ergodic Theory; Exchangeable Random Sequences; Graph Limits and Infinite Exchangeable Arrays
- Recommended, big picture:
  - Steffen Lauritzen, "Sufficiency, Partial Exchangeability, and Exponential Families" [PDF]
- Recommended, close-ups:
  - Persi Diaconis and David Freedman, "De Finetti's Theorem for Markov Chains", Annals of Probability 8 (1980): 115--130
  - Matthew O. Jackson, Ehud Kalai and Rann Smorodinsky, "Bayesian Representation of Stochastic Processes Under Learning: De Finetti Revisited", Econometrica 67 (1999): 875--893 [JSTOR]
  - Steffen Lauritzen, "Harmonic Analysis of Symmetric Random Graphs", arxiv:1908.06456
- Modesty forbids me to recommend:
  - CRS and Aryeh Kontorovich, "Predictive PAC Learning and Process Decompositions", NIPS 2013, arxiv:1309.4859
- To read:
  - Hugo Harari-Kermadec, "Regenerative block empirical likelihood for Markov chains", arxiv:1102.3107 [I can't tell from the abstract if this belongs here or not]
  - Paul Ressel
    - "De Finetti-type theorems: An analytical approach", Annals of Probability 13 (1985): 898--922
    - "Integral Representations for Distributions of Symmetric Stochastic Processes", Probability Theory and Related Fields 79 (1988): 451--467