Forecasting Non-Stationary Processes, and Estimating Their Parameters
26 Aug 2024 15:46
Some non-stationary processes are in fact easy to forecast. Periodic processes, for example, are strictly speaking non-stationary, but trivially predictable once the period is known. An ergodic Markov chain started far from its invariant distribution is also non-stationary, but easy to predict (its distribution will approach the invariant distribution). Both of these cases are conditionally stationary, which I think is all that's really needed.
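To make the Markov-chain case concrete, here is a minimal simulation sketch (my illustration, not part of the original argument): the one-step-ahead predictive distribution is just the current distribution pushed through the transition matrix, and it converges to the invariant distribution, so forecasting gets easier as the chain "burns in".

```python
# Minimal sketch: an ergodic two-state Markov chain started far from its
# invariant distribution. The transition matrix P is an arbitrary choice
# for illustration. The process is non-stationary, but its predictive
# distribution is easy to compute and converges to the invariant one.
import numpy as np

P = np.array([[0.9, 0.1],      # transition matrix; rows sum to 1
              [0.4, 0.6]])

# Invariant distribution: left eigenvector of P for eigenvalue 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

dist = np.array([1.0, 0.0])    # start "far" from pi: all mass on state 0
for t in range(20):
    dist = dist @ P            # one-step-ahead predictive distribution
    if t % 5 == 4:
        print(t + 1, dist, "L1 distance to pi:", np.abs(dist - pi).sum())
```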
What's more interesting is the problem of, so to speak, really non-stationary processes. It's hard to imagine that there is any way to truly predict an arbitrary non-stationary process. (Basically: as soon as you think you have established a trend-line, the Adversary can always reverse the trend, without creating any problems of consistency with earlier data.) If you can constrain the class of allowable non-stationary processes, however, then something might be possible. Alternatively, one might lower one's ambitions, from actually predicting well to merely predicting with low regret against some class of reference forecasters.
I actually have an Idea about using model averaging here, but need to find the time to work on it.
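For a concrete instance of the low-regret, model-averaging approach, here is a minimal sketch of the fixed-share forecaster of Herbster and Warmuth (cited below). To be clear, this is my generic illustration, not the "Idea": the expert set, learning rate eta, share parameter alpha, and toy data are all assumptions made up for the example.

```python
# Minimal sketch of fixed-share (Herbster & Warmuth): exponentially
# weighted averaging over experts, with a small share of weight
# redistributed each round so the ensemble can track a drifting best
# expert. Losses are squared error on forecasts in [0, 1]. All
# parameter values below are illustrative assumptions.
import numpy as np

def fixed_share(expert_preds, outcomes, eta=2.0, alpha=0.05):
    """expert_preds: (T, K) array of expert forecasts; outcomes: (T,)."""
    T, K = expert_preds.shape
    w = np.full(K, 1.0 / K)
    forecasts = np.empty(T)
    for t in range(T):
        forecasts[t] = w @ expert_preds[t]           # weighted-average forecast
        loss = (expert_preds[t] - outcomes[t]) ** 2  # per-expert losses
        w = w * np.exp(-eta * loss)                  # exponential weight update
        w = w / w.sum()
        w = (1 - alpha) * w + alpha / K              # fixed share: hedge switches
    return forecasts

# Toy drifting data: the best expert changes halfway through.
rng = np.random.default_rng(0)
T = 200
preds = np.column_stack([np.full(T, 0.2), np.full(T, 0.8)])
y = np.concatenate([np.full(T // 2, 0.2), np.full(T - T // 2, 0.8)])
y = np.clip(y + 0.05 * rng.standard_normal(T), 0, 1)
f = fixed_share(preds, y)
print("mean squared error:", np.mean((f - y) ** 2))
```

The alpha-share step is what lets the ensemble recover quickly when the identity of the best expert changes; with alpha = 0 this reduces to ordinary exponentially weighted averaging, whose weights can take a long time to shift after a switch.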
- See also:
- Ensemble Methods in Machine Learning
- Optimal Linear Prediction and Estimation
- Low-Regret Learning
- Time Series
- Universal Prediction
- Recommended, bigger picture:
- Oren Anava, Elad Hazan, Shie Mannor, Ohad Shamir, "Online Learning for Time Series Prediction", arxiv:1302.6927
- S. Caires and J. A. Ferreira, "On the Non-parametric Prediction of Conditionally Stationary Sequences", Statistical Inference for Stochastic Processes 8 (2005): 151--184
- R. Dahlhaus, "Fitting Time Series Models to Nonstationary Processes", Annals of Statistics 25 (1997): 1--37
- Mark Herbster and Manfred K. Warmuth, "Tracking the Best Expert", Machine Learning 32 (1998): 151--178 [PS version via Dr. Herbster]
- Claire Monteleoni and Tommi S. Jaakkola, "Online Learning of Non-stationary Sequences", pp. 1093--1100 in NIPS 2003 (vol. 16) [Figuring out at what rate to switch between experts]
- Joaquin Quinonero-Candela, Masashi Sugiyama, Anton Schwaighofer and Neil D. Lawrence (eds.), Dataset Shift in Machine Learning
- Recommended, close-ups:
- David T. Frazier and Bonsoo Koo, "Indirect inference for locally stationary models", Journal of Econometrics 223 (2021): 1--27 [Comments/queries]
- Elad Hazan and Satyen Kale, "Extracting certainty from uncertainty: regret bounded by variation in costs", Machine Learning 80 (2010): 165--188
- Jeremy Zico Kolter and Marcus A. Maloof
- "Dynamic Weighted Majority: An Ensemble Method for Drifting Concepts", Journal of Machine Learning Research 8 (2007): 2755--2790
- "Using Additive Expert Ensembles to Cope with Concept Drift", ICML 2005 [PDF reprint via Kolter]
- Wouter M. Koolen and Tim van Erven, "Switching between Hidden Markov Models using Fixed Share", arxiv:1008.4532
- Claire Monteleoni, Gavin Schmidt, Shailesh Saroha and Eva Asplund, "Tracking Climate Models", Statistical Analysis and Data Mining 4 (2011): 372--392 [While I list it as a "close-up" in this context, it's probably more important, in terms of its potential impact, than everything else on this page... PDF reprint via Prof. Monteleoni.]
- Maxim Raginsky, Roummel F. Marcia, Jorge Silva and Rebecca M. Willett, "Sequential Probability Assignment via Online Convex Programming Using Exponential Families" [ISIT 2009; PDF]
- Maxim Raginsky, Rebecca M. Willett, C. Horn, Jorge Silva and Roummel F. Marcia, "Sequential anomaly detection in the presence of noise and limited feedback", IEEE Transactions on Information Theory 58 (2012): 5544--5562, arxiv:0911.2904
- Kyupil Yeon, Moon Sup Song, Yongdai Kim, Hosik Choi, Cheolwoo Park, "Model averaging via penalized regression for tracking concept drift", Journal of Computational and Graphical Statistics 19 (2010): 457--473
- Pride compels me to recommend my students' work:
- Abigail Z. Jacobs, Adapting to non-stationarity with growing predictor ensembles [Senior thesis, Northwestern University, 2011]
- Michael Spece, Competitive Analysis for Machine Learning and Data Science [Ph.D. thesis, CMU, 2019. In this connection, see specifically ch. 3.]
- Modesty forbids me to recommend [the above-mentioned "Idea"]:
- CRS, Abigail Z. Jacobs, Kristina Lisa Klinkner and Aaron Clauset, "Adapting to Non-stationarity with Growing Expert Ensembles", arxiv:1103.0949
- To read (with thanks to Michael Wieck-Sosa for references on locally-stationary processes):
- István Berkes, Lajos Horváth and Shiqing Ling, "Estimation in nonstationary random coefficient autoregressive models", Journal of Time Series Analysis 30 (2009): 395--416 ["the unit root problem does not exist in the RCA model"!]
- O. Besbes, Y. Gur, A. Zeevi, "Non-stationary Stochastic Optimization", arxiv:1307.5449
- Satish T. S. Bukkapatnam and Changqing Cheng, "Forecasting the evolution of nonlinear and nonstationary systems using recurrence-based local Gaussian process models", Physical Review E 82 (2010): 056206
- Alexey Chernov, Vladimir Vovk, "Prediction with Advice of Unknown Number of Experts", arxiv:1006.0475
- Michael P. Clements and David F. Hendry, Forecasting Non-Stationary Economic Time Series
- Rainer Dahlhaus and Wolfgang Polonik, "Empirical spectral processes for locally stationary time series", Bernoulli 15 (2009): 1--39, arxiv:0902.1448
- Rainer Dahlhaus, Stefan Richter, Wei Biao Wu, "Towards a general theory for nonlinear locally stationary processes", Bernoulli 25 (2019): 1013--1044
- Jiti Gao, Bin Peng, Wei Biao Wu, and Yayi Yan, "Time-varying multivariate causal processes", Journal of Econometrics 240 (2024): 105671
- Eyal Gofer, Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, "Regret Minimization for Branching Experts", COLT 2013 / Journal of Machine Learning Research Workshops and Conference Proceedings 30 (2013): 618--638
- P. J. Harrison and C. F. Stevens, "A Bayesian Approach to Short-term Forecasting", Operational Research Quarterly 22 (1971): 341--362
- Ching-Kang Ing, Jin-Lung Lin, Shu-Hui Yu, "Toward optimal multistep forecasts in non-stationary autoregressions", Bernoulli 15 (2009): 402--437, arxiv:0906.2266 ["Optimal" assuming that you know you are facing a linear AR model.]
- Yan Karklin and Michael S. Lewicki, "A Hierarchical Bayesian Model for Learning Nonlinear Statistical Regularities in Nonstationary Natural Signals", Neural Computation 17 (2005): 397--423
- Dennis Kristensen, Young Jun Lee, "Local Polynomial Estimation of Time-Varying Parameters in Nonlinear Models", arxiv:1904.05209
- Zudi Lu, Dag Johan Steinskog, Dag Tjostheim and Qiwei Yao, "Adaptively Varying-Coefficient Spatiotemporal Models", Journal of the Royal Statistical Society B 71 (2009): 859--880 [PDF preprint]
- Alexander O'Neill, Marcus Hutter, Wen Shao, Peter Sunehag, "Adaptive Context Tree Weighting", arxiv:1201.2056
- Joshua W. Robinson, Alexander J. Hartemink, "Learning Non-Stationary Dynamic Bayesian Networks", Journal of Machine Learning Research 11 (2010): 3647--3680
- Masashi Sugiyama and Motoaki Kawanabe, Machine Learning in Non-Stationary Environments: Introduction to Covariate Shift Adaptation
- Nina Vaits, Edward Moroshko, Koby Crammer, "Second-Order Non-Stationary Online Learning for Regression", arxiv:1303.0140
- P. F. Verdes, P. M. Granitto and H. A. Ceccatto, "Overembedding Method for Modeling Nonstationary Systems", Physical Review Letters 96 (2006): 118701
- Michael Vogt and Holger Dette, "Detecting gradual changes in locally stationary processes", Annals of Statistics 43 (2015): 713--740, arxiv:1310.4678 and/or arxiv:1403.3808
- Qiying Wang and Peter C. B. Phillips, "A specification test for nonlinear nonstationary models", Annals of Statistics 40 (2012): 727--758
- Ou Zhao, Michael Woodroofe, "Estimating a monotone trend", arxiv:0812.3188
- Shuheng Zhou, John Lafferty, Larry Wasserman, "Time Varying Undirected Graphs", arxiv:0802.2758
- To write:
- CRS + co-conspirators to be named later, "This Time Is Different"