## Characterizing Mixtures of Processes by Summarizing Statistics

*27 Feb 2017 16:30*

A statistic is sometimes said to **summarize** a probability
measure if the probability of a sample is a function only of the value of the
statistic. A statistic which summarizes all the distributions in some family
is a sufficient statistic for that
family (by the Neyman factorization criterion). The empirical measure
summarizes any stochastic process which consists of independent and identically
distributed observations. Conversely, any stochastic process summarized by the
empirical measure is a mixture of IID processes,
i.e., exchangeable.
Any Markov process, similarly, is summarized by the
empirical measure of successive pairs of observations, plus the first
observation, and any process summarized by the empirical pair measure is a
mixture of Markov processes (thus Diaconis and Freedman). Similarly, any
process summarized by the empirical measure of \( k \)-tuples is a mixture of
Markov processes of order \( k-1 \).
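A minimal sketch of the Markov case, under assumptions of my own choosing: the two-state chain (`init`, `trans`) and the sample paths below are hypothetical, picked only for illustration. The point is that two paths with the same first observation and the same counts of successive pairs get the same probability under the chain, while a path with a different summary generally does not.

```python
from collections import Counter
from math import isclose, log


def pair_counts(x):
    """Unnormalized empirical pair measure: counts of successive pairs."""
    return Counter(zip(x[:-1], x[1:]))


def summary(x):
    """The candidate summarizing statistic: first observation plus pair counts."""
    return x[0], pair_counts(x)


def markov_log_prob(x, init, trans):
    """Log-probability of the path x, computed step by step along the path.

    init[s] is the probability of starting in s; trans[s][t] is P(t | s).
    """
    lp = log(init[x[0]])
    for s, t in zip(x[:-1], x[1:]):
        lp += log(trans[s][t])
    return lp


# Hypothetical two-state chain, for illustration only.
init = {"a": 0.5, "b": 0.5}
trans = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.4, "b": 0.6}}

x = "aabab"  # pairs: aa, ab, ba, ab
y = "abaab"  # pairs: ab, ba, aa, ab (same counts, different order)
z = "ababa"  # pairs: ab, ba, ab, ba (different counts)

same_summary = summary(x) == summary(y)
same_prob = isclose(markov_log_prob(x, init, trans), markov_log_prob(y, init, trans))
```

Here `x` and `y` share a summary and hence a probability, whatever transition matrix is plugged in; `z` starts in the same state but has different pair counts, so its probability differs under this particular chain.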

In large deviations theory, we use the
summarizing properties to prove exponential convergence of empirical
distributions to true distributions — for IID processes we start with the
convergence of the empirical measure, but for Markov processes we start with
the empirical pair measure (and get the convergence of the one-dimensional
distribution by integration). We can also extend these to higher-dimensional
distributions, in the (projective) limit looking at the **empirical path
measure** (sometimes called, in a namespace collision, the
"empirical process"), which we get by taking infinite cyclic repetitions of the
observed sample path and all its shifts. (That is, having observed the
sequence \( x_1, x_2, x_3, \ldots, x_{n-1}, x_n \), the empirical path measure
puts equal probability on \( x_1, x_2, x_3, \ldots, x_{n-1}, x_n \), repeated
forever; on \( x_2, x_3, \ldots, x_{n-1}, x_n, x_1 \), repeated forever; and so
on to repeating \( x_n, x_1, x_2, x_3, \ldots, x_{n-1} \) forever.)
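The construction can be sketched concretely through its finite-dimensional marginals. This is only an illustration under my own naming (`cyclic_shifts`, `path_measure_marginal`, and the sample `"aabab"` are hypothetical): the distribution of the first \( k \) coordinates under the empirical path measure is obtained by reading \( k \)-blocks off the sample with indices taken modulo \( n \).

```python
from collections import Counter
from fractions import Fraction


def cyclic_shifts(x):
    """One period of each of the n cyclic shifts of the sample x."""
    return [x[i:] + x[:i] for i in range(len(x))]


def path_measure_marginal(x, k):
    """Distribution of the first k coordinates under the empirical path measure.

    Each of the n shifts, repeated forever, carries probability 1/n, so the
    first k symbols of the i-th periodic sequence are x[(i + j) % n], j < k.
    """
    n = len(x)
    counts = Counter(tuple(x[(i + j) % n] for j in range(k)) for i in range(n))
    return {block: Fraction(c, n) for block, c in counts.items()}


def coordinate_marginal(dist, j):
    """Marginal distribution of coordinate j of a k-block distribution."""
    out = Counter()
    for block, p in dist.items():
        out[block[j]] += p
    return dict(out)


sample = "aabab"
pairs = path_measure_marginal(sample, 2)
first = coordinate_marginal(pairs, 0)
second = coordinate_marginal(pairs, 1)
```

Because of the wrap-around block \( (x_n, x_1) \), the pair marginal here differs slightly from the ordinary empirical pair measure, which uses only the \( n-1 \) pairs actually observed. In exchange, the first and second coordinates have the same marginal, which is the shift-invariance one would want in light of the conjecture about stationary processes.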

*Query*: For what class of processes is the empirical path measure
summarizing? *Query*: Is this class also a convex set? If so, what are
the extremal points? *Conjecture*: The class is the class of stationary processes, in which case the extremal points are the stationary and ergodic
measures.

If you would like to collaborate on this, get in touch. If you would like to work on it on your own, please be so good as to give me credit. If you know where this problem is already solved, please let me know.

See also: Ergodic Theory; Exchangeable Random Sequences; Graph Limits and Infinite Exchangeable Arrays

- Recommended, big picture:
- Steffen Lauritzen, "Sufficiency, Partial Exchangeability, and Exponential Families" [PDF]

- Recommended, close-ups:
- Persi Diaconis and David Freedman, "De Finetti's Theorem for Markov Chains", Annals of Probability **8**(1980): 115--130
- Matthew O. Jackson, Ehud Kalai and Rann Smorodinsky, "Bayesian Representation of Stochastic Processes Under Learning: De Finetti Revisited", Econometrica **67**(1999): 875--893 [JSTOR]

- Modesty forbids me to recommend:
- CRS and Aryeh Kontorovich, "Predictive PAC Learning and Process Decompositions", NIPS 2013, arxiv:1309.4859

- To read:
- Hugo Harari-Kermadec, "Regenerative block empirical likelihood for Markov chains", arxiv:1102.3107 [I can't tell from the abstract if this belongs here or not]
- Paul Ressel
  - "De Finetti-type theorems: An analytical approach", Annals of Probability **13**(1985): 898--922
  - "Integral Representations for Distributions of Symmetric Stochastic Processes", Probability Theory and Related Fields **79**(1988): 451--467
