Notebooks

Filtering, State Estimation, and Other Forms of Signal Processing

10 Dec 2018 20:35

Etymologically, a "filter" is something you pass a fluid through, especially to purify it. In signal processing, a filter became a device or operation you pass a signal through, especially to remove what is (for your purposes) noise or irrelevancies, thereby purifying it. Its meaning (in this sense) thus diverged: on the one hand, to general transformations of signals, such as preserving their high-frequency ("high-pass") or low-frequency ("low-pass") components; on the other hand, to trying to estimate some underlying true value or state, corrupted by distortion and noise.

Perhaps the classic and most influential "filter" in the latter sense was the one proposed by Norbert Wiener in the 1940s. More specifically, Wiener considered the problem of estimating a state $S(t)$, following a stationary stochastic process, from observations of another, related stationary stochastic process $X(t)$. He restricted himself to linear estimates, of the form $\hat{S}(t) = \int_{-\infty}^{t-h}{\kappa(t-s) X(s) ds}$ (or the equivalent sum in discrete time), and provided a solution, i.e., a function $\kappa(u)$, that minimized the expected squared error $\mathbb{E}[(S(t) - \hat{S}(t))^2]$. Notice that if $h > 0$, this combines estimating the state with extrapolating it into the future, while if $h < 0$, one is estimating the state with a lag. Wiener's solution did not assume that the true relationship between $S$ and $X$ is linear, or that either process shows linear dynamics --- just that the processes are stationary, and one wanted a linear estimator.
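In discrete time with a finite window, finding the optimal linear weights reduces to solving a system of normal equations built from the auto- and cross-covariances. A minimal sketch (the simulated AR(1)-plus-noise model and all parameter values here are my own illustrative assumptions, not anything from Wiener):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary state process (an AR(1), chosen for illustration)
# observed through additive noise.
n, a = 100_000, 0.9
S = np.zeros(n)
for t in range(1, n):
    S[t] = a * S[t - 1] + rng.normal()
X = S + rng.normal(size=n)

# Finite-window Wiener filter: estimate S(t) as sum_j w[j] * X(t-j),
# choosing w to minimize mean squared error.  The normal equations are
#   R w = r,  with R[j,k] = Cov(X(t-j), X(t-k)),  r[j] = Cov(S(t), X(t-j)).
p = 10
acov_X = np.array([np.mean(X[p:] * X[p - k:n - k]) for k in range(p)])
lags = np.arange(p)
R = acov_X[np.abs(lags[:, None] - lags[None, :])]   # Toeplitz, by stationarity
r = np.array([np.mean(S[p:] * X[p - k:n - k]) for k in range(p)])
w = np.linalg.solve(R, r)

# Apply the filter; compare to the naive estimate S-hat(t) = X(t).
S_hat = sum(w[k] * X[p - k:n - k] for k in range(p))
mse_wiener = np.mean((S[p:] - S_hat) ** 2)
mse_naive = np.mean((S[p:] - X[p:]) ** 2)
```

Note that the covariances are estimated from the data here for convenience; Wiener's problem takes them as given.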

Later, in the late 1950s, Kalman, and Kalman and Bucy, tackled the situation where $S(t)$ follows a linear, Gaussian, discrete-time Markov process, so $S(t+1) = a S(t) + \epsilon(t)$ for some independent Gaussian noise variables $\epsilon$, and the observable $X(t)$ is linearly related to $S(t)$, $X(t) = b S(t) + \eta(t)$. They solved for the conditional distribution of $S(t)$ given $X(1), \ldots X(t)$, say $S(t)|X(1:t)$. This is again a Gaussian, whose mean and variance can be expressed in closed form given the parameters, and the mean and variance of $S(t-1)|X(1:t-1)$. The recursive computation of this conditional distribution came to be called the Kalman filter. The Kalman smoother came to refer to the somewhat more involved computation of $S(t)|X(1:n)$, $n > t$ --- that is, going back and refining the estimate of the unobserved state using later observations. This seems to be the root of distinguishing two ways of estimating the states of hidden Markov models, filtering, i.e., getting $S(t)|X(1:t)$, and smoothing, i.e., getting $S(t)|X(1:n)$.
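For the scalar case, the closed-form recursion is short enough to write out in full. A minimal sketch (the function name and interface are mine; the predict/update algebra is the standard scalar Kalman filter for the model above, with $\mathrm{Var}(\epsilon) = q$ and $\mathrm{Var}(\eta) = r$):

```python
import numpy as np

def kalman_filter(x, a, b, q, r, m0=0.0, v0=1.0):
    """Scalar Kalman filter for S(t+1) = a*S(t) + eps(t), X(t) = b*S(t) + eta(t),
    with Var(eps) = q and Var(eta) = r.  Starting from a Gaussian prior
    N(m0, v0) on the initial state, returns the means and variances of the
    Gaussian filtering distributions S(t) | X(1:t)."""
    means, variances = [], []
    m, v = m0, v0
    for xt in x:
        # Update: condition the current state estimate on the new observation
        # (Bayes's rule for Gaussians).
        k = v * b / (b * b * v + r)          # Kalman gain
        m = m + k * (xt - b * m)
        v = (1 - k * b) * v
        means.append(m)
        variances.append(v)
        # Predict: extrapolate the estimate forward through the dynamics.
        m = a * m
        v = a * a * v + q
    return np.array(means), np.array(variances)
```

Because everything stays Gaussian, two numbers (mean and variance) summarize the whole conditional distribution at each step, which is what makes the recursion tractable.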

The Kalman filter also, unlike the Wiener filter, relied strongly on assumptions about the data-generating process, namely that it really was a linear, Gaussian, hidden-Markov or state-space process. The fragility of the latter assumptions spurred a lot of work, over many years, seeking to either repeat the same pattern under different assumptions, or to use Kalman's solution as some kind of local approximation.

Separately (so far as I can tell from reading the literature), people who were interested in discrete-state Markov chains, observed through noise, considered the same problem of estimating the state from observations. A recursive estimate of $S(t)|X(1:t)$ came to be called the "forward algorithm". (Applying the exact same ideas to linear-Gaussian HMMs would give you the Kalman filter, though I don't know when people realized that.) The forward algorithm, in turn, served as a component in a more elaborate recursive algorithm for the smoothing distribution $S(t)|X(1:n)$; the new ingredient, a recursion running from the end of the series back towards the start, is called the "backward algorithm".

The forward algorithm is actually very pretty, and I've just taught it, so I'll sketch the derivation. Assume we've got $\Pr(S(t)|X(1:t))$, and know $\Pr(S(t+1)|S(t))$ (all we need for the dynamics, since the state $S$ is a Markov process) and $\Pr(X(t)|S(t))$ (all we need for the observations, since $X$ is a hidden-Markov process). First, we extrapolate the state-estimate forward in time: \begin{eqnarray*} \Pr(S(t+1)=s|X(1:t)) & = & \sum_{r}{\Pr(S(t+1)=s, S(t)=r|X(1:t))}\\ & = & \sum_{r}{\Pr(S(t+1)=s|S(t)=r, X(1:t))\Pr(S(t)=r|X(1:t))}\\ & = & \sum_{r}{\Pr(S(t+1)=s|S(t)=r)\Pr(S(t)=r|X(1:t))} \end{eqnarray*} Next, we calculate the predictive distribution: \begin{eqnarray*} \Pr(X(t+1)=x|X(1:t)) & = & \sum_{s}{\Pr(X(t+1)=x|S(t+1)=s, X(1:t))\Pr(S(t+1)=s| X(1:t))}\\ & = & \sum_{s}{\Pr(X(t+1)=x|S(t+1)=s)\Pr(S(t+1)=s|X(1:t))} \end{eqnarray*} Finally, we use Bayes's rule: \begin{eqnarray*} \Pr(S(t+1)=s|X(1:t+1)) & = & \Pr(S(t+1)=s|X(1:t), X(t+1))\\ & = & \frac{\Pr(S(t+1)=s, X(t+1)|X(1:t))}{\Pr(X(t+1)|X(1:t))}\\ & = & \frac{\Pr(X(t+1)|S(t+1)=s)\Pr(S(t+1)=s|X(1:t))}{\Pr(X(t+1)|X(1:t))} \end{eqnarray*} It's worth noting here that the exact same idea works if $S(t)$ and/or $X(t)$ are continuous rather than discrete --- just replace probabilities with probability densities, and sums with integrals as appropriate.
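The three steps above translate almost line-for-line into code for a discrete HMM. A minimal sketch (function and argument names are mine):

```python
import numpy as np

def forward(x, pi0, trans, emit):
    """Forward algorithm for a discrete HMM.
    pi0[s]      = Pr(S(1) = s)
    trans[r, s] = Pr(S(t+1) = s | S(t) = r)
    emit[s, o]  = Pr(X(t) = o | S(t) = s)
    Returns filt, where row t is the filtering distribution of the state at
    time t given X(1:t), and pred, where pred[t] = Pr(X(t) | X(1:t-1))."""
    n, k = len(x), len(pi0)
    filt = np.zeros((n, k))
    pred = np.zeros(n)
    prior = np.asarray(pi0, dtype=float)       # Pr(S(1)): no data seen yet
    for t in range(n):
        joint = prior * emit[:, x[t]]          # Pr(S(t)=s, X(t) | X(1:t-1))
        pred[t] = joint.sum()                  # predictive: Pr(X(t) | X(1:t-1))
        filt[t] = joint / pred[t]              # Bayes's rule
        prior = filt[t] @ trans                # extrapolate: Pr(S(t+1) | X(1:t))
    return filt, pred
```

Each pass through the loop performs one extrapolation, one predictive calculation, and one application of Bayes's rule, in the order of the derivation.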

A purely formal solution to finding $S(t)|X(1:t)$ in arbitrary nonlinear processes, even in continuous time, was worked out by the 1960s; it was, again, a recursion which implemented Bayes's rule. Unfortunately, with continuous states, it's pretty much intractable in general, since you'd need to maintain a probability density over possible states (and then integrate it, twice). This only got people more interested in the special cases which admitted closed forms (like the Kalman filter), or, again, to approximations based on those closed forms.

A minor revolution from the 1990s --- I forget the exact dates and I'm under-motivated to look them up --- was to realize that the exact nonlinear filter could be approximated by Monte Carlo. Look at the way I derived the forward algorithm above. Suppose we didn't know the exact distribution $\Pr(S(t)|X(1:t))$, but we did have a sample $ R_1, R_2, \ldots R_m $ drawn from it. We could take each of these and (independently) apply the Markov process's transition to it, to get new states, at time $t+1$, say $ S_1, S_2, \ldots S_m $. These values constitute a sample from $\Pr(S(t+1)|X(1:t))$. The model tells us $\Pr(X(t+1)=x|S(t+1)=S_i)$ for each $ S_i $. Averaging those distributions over the samples gives us an approximation to $\Pr(X(t+1)=x|X(1:t))$. If we re-sample the $ S_i $'s with probabilities proportional to $\Pr(X(t+1)=x(t+1)|S(t+1)=S_i)$, we are left with an approximate sample from $\Pr(S(t+1)|X(1:t+1))$. This is the particle filter, the samples being "particles". (The mathematical sciences are not known for consistent, well-developed metaphors.)
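The propagate / weight / resample loop is only a few lines. A minimal sketch of the simplest ("bootstrap") version, where the user supplies the model as three functions (all names here are mine, and the resampling scheme shown is plain multinomial resampling, the crudest of several options):

```python
import numpy as np

def particle_filter(x, m, init, transition, likelihood, rng):
    """Bootstrap particle filter.
    init(m)                  -- draws m particles from the initial state distribution
    transition(particles)    -- pushes each particle through the Markov dynamics
    likelihood(xt, particles)-- Pr(X(t)=xt | S(t)=particle), up to a constant
    Returns Monte Carlo estimates of the filtering means E[S(t) | X(1:t)]."""
    particles = init(m)
    means = []
    for xt in x:
        particles = transition(particles)      # sample from Pr(S(t+1) | X(1:t))
        w = likelihood(xt, particles)          # weight by the new observation
        w = w / w.sum()
        idx = rng.choice(m, size=m, p=w)       # resample: Bayes's rule, by Monte Carlo
        particles = particles[idx]
        means.append(particles.mean())
    return np.array(means)
```

Nothing here assumes linearity or Gaussianity; the model enters only through the ability to simulate transitions and evaluate observation likelihoods.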

All of this presumed (like the Kalman filter) that the parameters of the process were known, and all one needed to estimate was the particular realization of the hidden state $S(t)$. (Even the Wiener filter presumes a knowledge of the covariance functions, though no more.) But the forward and backward algorithms can be used as components in an algorithm for maximum likelihood estimation of the parameters of an HMM, variously called the "forward-backward", "Baum-Welch" or "expectation-maximization" algorithm. (In fact, the forward algorithm gives us $\Pr(X(t+1)=x(t+1)|X(1:t))$ for each $t$; multiplying these together, $\prod_{t}{\Pr(X(t+1)=x(t+1)|X(1:t))}$, clearly gives the probability that the model assigns to the whole observed trajectory $X(1:n)$.) Since this is the "likelihood" of the model (in the sense statisticians use that word), once we've done "filtering", we can use all the common likelihood-based statistical techniques to estimate the model. (Of course, whether likelihood works is another issue...)
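Concretely, the likelihood falls out of the forward recursion as a by-product. A minimal sketch (same hypothetical interface as the forward algorithm above; accumulating logs rather than multiplying raw probabilities, since the product underflows for long series):

```python
import numpy as np

def hmm_log_likelihood(x, pi0, trans, emit):
    """Log-likelihood of observations x under a discrete HMM, computed by
    summing log Pr(X(t) | X(1:t-1)) along the forward recursion.
    pi0[s] = Pr(S(1)=s), trans[r,s] = Pr(S(t+1)=s|S(t)=r),
    emit[s,o] = Pr(X(t)=o|S(t)=s)."""
    prior = np.asarray(pi0, dtype=float)
    ll = 0.0
    for xt in x:
        joint = prior * emit[:, xt]     # Pr(S(t)=s, X(t) | X(1:t-1))
        px = joint.sum()                # predictive prob. of this observation
        ll += np.log(px)
        prior = (joint / px) @ trans    # filter, then extrapolate
    return ll
```

Maximizing this over the parameters (by EM or otherwise) is exactly the estimation problem the Baum-Welch algorithm solves.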

Independent component analysis.

See also: Control Theory; Information Geometry; Monte Carlo and Stochastic Simulation; Time Series

