## Change-Point Problems

*16 Mar 2020 10:07*

Suppose you have a time series which has some (stochastic) property you're interested in, say its expected value or its variance. You think this is usually constant, but that if it does change, it does so abruptly. You would like to know if and when it changes, and perhaps to localize the time when it did. You now have a change-point problem.

In an "offline" setting, where you've collected all you're data and want to analyze it in retrospect, there are fairly straightforward approaches, especially if you are willing to invoke parametric models. We imagine that we've got data , X_2, \ldots X_T \equiv X_{1:T}$, and that if there's no change point then these follow some known probability distribution $p(x_{1:T}; \theta_0)$. If on the other hand there is a break at time $t$, then the distribution is $p(x_{1:t};\theta_0) p(x_{t+1:T};\theta_1)$. (If you really want to condition the after-break stuff on the pre-break stuff go ahead.) The Neyman-Pearson lemma tells us that the optimal test is the likelihood ratio test, so we should locate the break-point at $t$ if the log likelihood ratio \[ \log{\frac{p(x_{1:t};\theta_0) p(x_{t+1:T};\theta_1)}{p(x_{1:T};\theta_0)}} \] is sufficiently large. If neither the pre-change nor post-change parameters are exactly known, we can often still get good results by substituting in their maximum likelihood estimators (or other consistent, efficient estimators), say \[ \log{\frac{p(x_{1:t};\hat{\theta}_0) p(x_{t+1:T};\hat{\theta}_1)}{p(x_{1:T};\tilde{\theta}_0)}} \] If the basic parametric family has $d$ continuous parameters, then the version with the change point at time $t$ has $2d$ continuous parameters, and we'd typically get a $\chi^2_{2d-d} = \chi^2_{d}$ distribution for the distribution of the log likelihood ratio under the null hypothesis of no change. (Notice that the estimate of $\theta_0$ should be different under the null of no change and the alternative of a change at $t$, so I've given the different estimators different accent marks.)

We usually also don't just care about *one* particular possible
change point $t$, but rather want to consider a range as possibilities. This
will often be handled by maximizing the likelihood over different possible
values of $t$. Since that's not a continuous parameter, the usual $\chi^2$
theory doesn't apply.

See also: Time Series; Non-Stationary Forecasting; Filtering and State Estimation

- Recommended:
- Sylvain Arlot and Alain Celisse, "Segmentation of the mean of heteroscedastic data via cross-validation", Statistics and Computing
**21**(2011): 613--632, arxiv:0902.3977 [MATLAB code] - Emily B. Fox, Erik B. Sudderth, Michael I. Jordan, Alan S. Willsky,
"A sticky HDP-HMM with application to speaker diarization",
Annals of Applied Statistics
**5**(2011): 1020--1056, arxiv:0905.2592 - Daniil Ryabko and Boris Ryabko, "Testing Statistical Hypotheses
About Ergodic
Processes", arxiv:0804.0510
[Appears to be the same as their "Nonparametric Statistical Inference for
Ergodic
Processes", IEEE
Transactions on Information Theory
**56**(2010): 1430--1435 - Wenguang Sun, T. Tony Cai, "Large-scale multiple testing under dependence", Journal of the Royal Statistical
Society B
**71**(2008): 393--424 - Albert Vexler, "Martingale Type Statistics Applied to Change Point
Detection", Communications in Statistics - Theory and Methods
**37**(2008): 1207--1224

- Recommended, historical:
- G. A. Barnard, "Control Charts and Stochastic Processes",
Journal of the Royal Statistical Society B
**21**(1959): 239--271 [JSTOR] - E. S. Page, "Continuous Inspection Schemes", Biometrika
**41**(1954): 100--115 [JSTOR. Origin of the famous "cumulative sum" ("CUSUM") procedure for finding breaks.] - A. Wald and J. Wolfowitz, "Optimum Character of the Sequential Probability Ratio Test", Annals of Mathematical Statistics
**19**(1948): 326--339

- To read:
- Alexander Aue and Lajos Horváth, "Structural breaks in time series", Journal of Time Series Analysis
**34**(2013): 1--16 - Boris Brodsky and Boris Darkhovsky, "Sequential change-point detection for mixing random sequences under composite hypotheses", Statistical Inference for Stochastic Processes
**11**(2008): 35--54 - S. Camargo, S. M. Duarte Quieros and C. Anteneodo, "Nonparametric segmentation of nonstationary time series", Physical Review E
**84**(2011): 046702 - Cheng-Der Fuh
- "SPRT and CUSUM in hidden Markov models",
Annals of Statistics
**31**(2003): 942--977 - "Asymptotic operating characteristics of an optimal
change point detection in hidden Markov models", Annals of
Statistics
**32**(2004): 2305--2339 = math.ST/0503682

- "SPRT and CUSUM in hidden Markov models",
Annals of Statistics
- Samir Ben Hariz, Jonathan J. Wylie, Qiang Zhang, "Optimal rate of convergence for nonparametric change-point estimators for nonstationary sequences", Annals of Statistics
**35**(2007): 1802--1826, arxiv:0710.4217 - Heping He and Thomas A. Severini, "Asymptotic properties of maximum likelihood estimators in models with multiple change points", Bernoulli
**16**(2010): 759--779, arxiv:1102.5224 - C. T. Jose, B. Ismail, S. Jayasekhar, "Trend, Growth Rate, and Change Point Analysis: A Data Driven Approach", Communications in Statistics: Simulation and Computation
**37**(2008): 498--506 - Azadeh Khaleghi, Daniil Ryabko, "Multiple Change-Point Estimation in Stationary Ergodic Time-Series", arxiv:1203.1515
- Shiqing Ling, "Testing for change points in time series models and limiting theorems for NED sequences", Annals of Statistics
**35**(2007): 1213--1237, arxiv:0708.2369 - George V. Moustakides, "Sequential change detection
revisited", arxiv:0804.0741
= Annals of Statistics
**36**(2008): 787--807 - Chun Yip Yau and Richard A. Davis, "Likelihood inference for discriminating between long-memory and change-point models", Journal of Time Series Analysis
**33**(2012): 649--664