## Filtering, State Estimation, and Other Forms of Signal Processing

*31 Oct 2023 20:40*

Etymologically, a "filter" is something you pass a fluid through, especially to purify it. In signal processing, a filter became a device or operation you pass a signal through, especially to remove what is (for your purposes) noise or irrelevancies, thereby purifying it. Its meaning (in this sense) thus diverged: on the one hand, to general transformations of signals, such as preserving their high-frequency ("high-pass") or low-frequency ("low-pass") components; on the other hand, to trying to estimate some underlying true value or state, corrupted by distortion and noise.

Perhaps the classic and most influential "filter" in the latter sense was
the one proposed by Norbert Wiener in the 1940s.
More specifically, Wiener considered the problem of estimating a state \( S(t)
\), following a stationary stochastic process, from observations of another,
related stationary stochastic process \( X(t) \). He restricted himself to
linear estimates, of the form \( \hat{S}(t) = \int_{-\infty}^{t-h}{\kappa(t-s)
X(s) ds} \) (or the equivalent sum in discrete time), and provided a solution,
i.e., a function \( \kappa(u) \), that minimized the expected squared error \(
\mathbb{E}[(S(t) - \hat{S}(t))^2] \). Notice that if \( h \) is positive, this
combines estimating the state with extrapolating it into the future, while if
\( h \) is negative, one is estimating the state with a lag. Wiener's solution
did *not* assume that the true relationship between \( S \) and \( X \)
is linear, or that either process shows linear dynamics --- just that the
processes are stationary, and one wanted a linear estimator.
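
To make the least-squares problem concrete, here is a minimal sketch of a finite-window, discrete-time analogue. It is not Wiener's own solution, which handles the whole infinite past. Restricting the estimate to \( p \) lags, \( \hat{S}(t) = \sum_{u=h}^{h+p-1}{\kappa(u) X(t-u)} \), and setting the derivatives of the expected squared error to zero gives the linear "normal equations" \( \sum_{u}{\kappa(u) \mathrm{Cov}[X(t-u), X(t-v)]} = \mathrm{Cov}[S(t), X(t-v)] \), one for each lag \( v \). The function names are mine, and I am assuming that both processes have mean zero and that the covariance functions `cov_x` and `cov_sx` are simply handed to us.

```python
def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Assumes the matrix a is square and non-singular (no check is made).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def finite_wiener_weights(cov_x, cov_sx, h, p):
    """Weights kappa(h), ..., kappa(h+p-1) of the best linear estimate of S(t)
    from X(t-h), ..., X(t-h-p+1).

    cov_x(k)  : Cov[X(t), X(t+k)], assumed symmetric in k (stationarity)
    cov_sx(v) : Cov[S(t), X(t-v)]
    """
    lags = range(h, h + p)
    a = [[cov_x(u - v) for u in lags] for v in lags]
    b = [cov_sx(v) for v in lags]
    return solve_linear(a, b)
```

For instance, if \( X(t) = S(t) + \) white noise, with the signal and the noise each having variance 1, then `finite_wiener_weights(lambda k: 2.0 if k == 0 else 0.0, lambda v: 1.0 if v == 0 else 0.0, 0, 1)` shrinks the current observation by the familiar signal-to-total-variance ratio of one half.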

Later, in the late 1950s, Kalman, and Kalman and Bucy, tackled the situation
where \( S(t) \) follows a linear, Gaussian,
discrete-time Markov process, so \( S(t+1) = a S(t) +
\epsilon(t) \) for some independent Gaussian noise variables \( \epsilon \), and the
observable \( X(t) \) is linearly related to \( S(t) \), \( X(t) = b S(t) + \eta(t) \).
They solved for the conditional distribution of \( S(t) \) given \( X(1), \ldots,
X(t) \), say \( S(t)|X(1:t) \). This is again a Gaussian, whose mean and variance
can be expressed in closed form given the parameters, and the mean and variance
of \( S(t-1)|X(1:t-1) \). The recursive computation of this conditional
distribution came to be called the **Kalman filter**. The
Kalman **smoother** came to refer to the somewhat more involved
computation of \( S(t)|X(1:n) \), \( n > t \) --- that is, going back and refining the
estimate of the unobserved state using later observations. This seems to be
the root of distinguishing two ways of estimating the states
of hidden Markov models, filtering, i.e., getting
\( S(t)|X(1:t) \), and smoothing, i.e., getting \( S(t)|X(1:n) \).
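
The recursion is short enough to write out for the scalar case. What follows is a minimal sketch, assuming the state equation \( S(t+1) = a S(t) + \epsilon(t) \) and the observation equation \( X(t) = b S(t) + \eta(t) \), with \( \epsilon \) and \( \eta \) independent, mean-zero Gaussians of variances `q` and `r`. Those two names, the function names, and the Gaussian prior (`mean0`, `var0`) on the state just before the first observation are my own choices for illustration.

```python
def kalman_step(mean, var, x, a, b, q, r):
    """Go from the mean and variance of S(t-1)|X(1:t-1) to those of S(t)|X(1:t)."""
    # Predict: push the old estimate through S(t) = a S(t-1) + epsilon
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: condition the Gaussian prediction on X(t) = b S(t) + eta
    gain = pred_var * b / (b * b * pred_var + r)
    new_mean = pred_mean + gain * (x - b * pred_mean)
    new_var = (1.0 - gain * b) * pred_var
    return new_mean, new_var


def kalman_filter(xs, a, b, q, r, mean0=0.0, var0=1.0):
    """Filtered (mean, variance) pairs for every time step of the observations xs."""
    mean, var = mean0, var0
    out = []
    for x in xs:
        mean, var = kalman_step(mean, var, x, a, b, q, r)
        out.append((mean, var))
    return out
```

Notice that the variances never depend on the observed values, only on the parameters, so in this model the precision of the state estimate can be worked out before seeing any data.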

The Kalman filter also, unlike the Wiener filter, relied strongly on assumptions about the data-generating process, namely that it really was a linear, Gaussian, hidden-Markov or state-space process. The fragility of the latter assumptions spurred a lot of work, over many years, seeking to either repeat the same pattern under different assumptions, or to use Kalman's solution as some kind of local approximation.

Separately (so far as I can tell from reading the literature), people who were interested in discrete-state Markov chains, observed through noise, considered the same problem of estimating the state from observations. A recursive estimate of \( S(t)|X(1:t) \) came to be called the "forward algorithm". (Applying the exact same ideas to linear-Gaussian HMMs would give you the Kalman filter, though I don't know when people realized that.) The forward algorithm, in turn, served as a component in a more elaborate recursive algorithm for \( S(t)|X(1:n) \), called the "backward algorithm".

The forward algorithm is actually very pretty, and I've just taught it, so I'll sketch the derivation. Assume we've got \( \Pr(S(t)|X(1:t)) \), and know \( \Pr(S(t+1)|S(t)) \) (all we need for the dynamics, since the state \( S \) is a Markov process) and \( \Pr(X(t)|S(t)) \) (all we need for the observations, since \( X \) is a hidden-Markov process). First, we extrapolate the state-estimate forward in time:

\begin{eqnarray*} \Pr(S(t+1)=s|X(1:t)) & = & \sum_{r}{\Pr(S(t+1)=s, S(t)=r|X(1:t))}\\ & = & \sum_{r}{\Pr(S(t+1)=s|S(t)=r, X(1:t))\Pr(S(t)=r|X(1:t))}\\ & = & \sum_{r}{\Pr(S(t+1)=s|S(t)=r)\Pr(S(t)=r|X(1:t))} \end{eqnarray*}

Next, we calculate the predictive distribution:

\begin{eqnarray*} \Pr(X(t+1)=x|X(1:t)) & = & \sum_{s}{\Pr(X(t+1)=x|S(t+1)=s, X(1:t))\Pr(S(t+1)=s| X(1:t))}\\ & = & \sum_{s}{\Pr(X(t+1)=x|S(t+1)=s)\Pr(S(t+1)=s|X(1:t))} \end{eqnarray*}

Finally, we use Bayes's rule:

\begin{eqnarray*} \Pr(S(t+1)=s|X(1:t+1)) & = & \Pr(S(t+1)=s|X(1:t), X(t+1))\\ & = & \frac{\Pr(S(t+1)=s, X(t+1)|X(1:t))}{\Pr(X(t+1)|X(1:t))}\\ & = & \frac{\Pr(X(t+1)|S(t+1)=s)\Pr(S(t+1)=s|X(1:t))}{\Pr(X(t+1)|X(1:t))} \end{eqnarray*}

It's worth noting here that the exact same idea works if \( S(t) \) and/or \( X(t) \) are continuous rather than discrete --- just replace probabilities with probability densities, and sums with integrals as appropriate.
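
The three steps (extrapolate, predict, apply Bayes's rule) translate almost line for line into code. Here is a minimal sketch for a finite-state, finite-alphabet HMM, assuming states and observations are coded as integers starting from 0. The representation is my own choice for illustration: `trans[r][s]` stands for \( \Pr(S(t+1)=s|S(t)=r) \) and `emit[s][x]` for \( \Pr(X(t)=x|S(t)=s) \).

```python
def forward_step(filt, x_next, trans, emit):
    """One step of the forward algorithm.

    filt[r] = Pr(S(t)=r | X(1:t)).
    Returns the pair (Pr(S(t+1)=. | X(1:t+1)), Pr(X(t+1)=x_next | X(1:t))).
    """
    k = len(filt)
    # Extrapolate the state estimate forward in time
    pred = [sum(trans[r][s] * filt[r] for r in range(k)) for s in range(k)]
    # Predictive probability of the new observation
    joint = [emit[s][x_next] * pred[s] for s in range(k)]
    px = sum(joint)
    # Bayes's rule: normalize the joint probabilities
    return [j / px for j in joint], px
```

With two states, `trans = [[0.9, 0.1], [0.2, 0.8]]`, `emit = [[0.8, 0.2], [0.3, 0.7]]` and a current estimate of `[0.5, 0.5]`, observing `x_next = 1` shifts the estimate towards the second state, which is the one more likely to emit a 1.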

A purely formal solution to finding \( S(t)|X(1:t) \) in arbitrary nonlinear processes, even in continuous time, was worked out by the 1960s; it was, again, a recursion which implemented Bayes's rule. Unfortunately, with continuous states, it's pretty much intractable in general, since you'd need to maintain a probability density over possible states (and then integrate it, twice). This only got people more interested in the special cases which admitted closed forms (like the Kalman filter), or, again, in approximations based on those closed forms.

A minor revolution from the 1990s --- I forget the exact dates and I'm
under-motivated to look them up --- was to realize that the exact nonlinear
filter could be approximated by Monte Carlo. Look at the way I derived the
forward algorithm above. Suppose we didn't know the exact distribution
\( \Pr(S(t)|X(1:t)) \), but we did have a sample \( R_1, R_2, \ldots R_m \) drawn from
it. We could take each of these and (independently) apply the Markov process's
transition to it, to get new states, at time \( t+1 \), say \( S_1, S_2, \ldots S_m \).
These values constitute a sample from \( \Pr(S(t+1)|X(1:t)) \). The model tells us
\( \Pr(X(t+1)=x|S(t+1)=S_i) \) for each \( S_i \). Averaging those distributions over
the samples gives us an approximation to \( \Pr(X(t+1)=x|X(1:t)) \). If we
re-sample the \( S_i \)'s with probabilities proportional to
\( \Pr(X(t+1)=x(t+1)|S(t+1)=S_i) \), we are left with an approximate sample from
\( \Pr(S(t+1)|X(1:t+1)) \). This is the **particle filter**, the samples
being "particles". (The mathematical sciences are not known for consistent,
well-developed metaphors.)
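
One step of the simplest version of this scheme can be sketched as follows. The function and argument names are mine, not any library's: `transition(r, rng)` is assumed to draw a new state given the old state `r`, and `likelihood(x, s)` to return \( \Pr(X(t+1)=x|S(t+1)=s) \), or the corresponding density.

```python
import random


def particle_filter_step(particles, x_next, transition, likelihood, rng):
    """One propagate-weight-resample step.

    particles: approximate sample from Pr(S(t) | X(1:t)).
    Returns (approximate sample from Pr(S(t+1) | X(1:t+1)),
             estimate of Pr(X(t+1)=x_next | X(1:t))).
    """
    # Apply the Markov transition to each particle independently
    moved = [transition(r, rng) for r in particles]
    # Weight each new particle by how well it explains the new observation
    weights = [likelihood(x_next, s) for s in moved]
    # The average weight approximates the predictive probability
    pred_prob = sum(weights) / len(weights)
    # Resample with probability proportional to weight
    # (rng.choices raises ValueError if every weight is zero)
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, pred_prob
```

Passing in an explicit `random.Random` instance as `rng` keeps runs reproducible. Resampling at every step, as here, is only the most basic choice among several resampling schemes.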

All of this presumed (like the Kalman filter) that the parameters of the
process were known, and all one needed to estimate was the particular
realization of the hidden state \( S(t) \). (Even the Wiener filter presumes a
knowledge of the covariance functions, though no more.) But the forward and
backward algorithms can be used as components in an algorithm for maximum
likelihood estimation of the parameters of an HMM, variously called the
"forward-backward", "Baum-Welch" or "expectation-maximization" algorithm. (In
fact, the forward algorithm gives us \( \Pr(X(t+1)=x|X(1:t)) \) for each \( t \).
Evaluating each of these at the value actually observed and multiplying them
together, \( \prod_{t}{\Pr(X(t+1)=x(t+1)|X(1:t))} \) (with the first factor being
just \( \Pr(X(1)=x(1)) \)), gives
the probability that the model assigns to the whole observed trajectory
\( X(1:n) \). Since this is the "likelihood" of the model (in the sense
statisticians use that word), once we've done "filtering", we can use all the
common likelihood-based statistical techniques to estimate the model.
Of course, whether likelihood *works* is another issue...)
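
The likelihood calculation can be sketched in a few lines for a finite-state HMM, accumulating logs of the one-step predictive probabilities rather than their raw product, to avoid numerical underflow on long series. As assumptions of mine for illustration: states and observations are coded as integers from 0, `init[s]` stands for \( \Pr(S(1)=s) \), `trans[r][s]` for \( \Pr(S(t+1)=s|S(t)=r) \), and `emit[s][x]` for \( \Pr(X(t)=x|S(t)=s) \).

```python
import math


def hmm_log_likelihood(xs, init, trans, emit):
    """Log-probability the HMM assigns to the observed sequence xs."""
    k = len(init)
    pred = list(init)  # Pr(S(1)=s), before any data have been seen
    loglik = 0.0
    for x in xs:
        joint = [emit[s][x] * pred[s] for s in range(k)]
        px = sum(joint)  # Pr(X(t)=x(t) | X(1:t-1))
        loglik += math.log(px)
        filt = [j / px for j in joint]  # Pr(S(t)=s | X(1:t))
        # Extrapolate to get Pr(S(t+1)=s | X(1:t)) for the next round
        pred = [sum(trans[r][s] * filt[r] for r in range(k)) for s in range(k)]
    return loglik
```

This costs a number of operations proportional to the length of the series times the square of the number of states, whereas summing over every possible hidden state sequence directly would cost exponentially many. Handing the function to a numerical optimizer over `trans` and `emit` is one crude route to maximum likelihood; Baum-Welch is the classical alternative.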

*Independent component analysis.*

- Recommended, big picture:
- Nasir Uddin Ahmed, Introduction to Linear and Nonlinear Filtering for Engineers and Scientists [Clear introductory treatment with not-too-rigorous use of advanced probability theory, which is necessary to really explain what is going on and why it works for nonlinear and/or continuous-time signals.]
- R. W. R. Darling, Nonlinear Filtering --- Online Survey
- Neil Gershenfeld, The Nature of Mathematical Modeling, Part III
- Holger Kantz and Thomas Schreiber, Nonlinear Time Series Analysis
- Robert Shumway and David Stoffer, Time Series Analysis and Its Applications

- Recommended, closeups:
- Thomas Bengtsson, Peter Bickel, Bo Li, "Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems", pp. 316--334 in Deborah Nolan and Terry Speed (eds.), Probability and Statistics: Essays in Honor of David A. Freedman
- Jochen Bröcker and Ulrich Parlitz, "Analyzing communication schemes using methods from nonlinear filtering," Chaos **13**(2003): 195--208
- A. E. Brockwell, A. L. Rojas and R. E. Kass, "Recursive Bayesian Decoding of Motor Cortical Signals by Particle Filtering", Journal of Neurophysiology **91**(2004): 1899--1907 [Very nice, especially since they're combining data from multiple experiments. It is a *little* disappointing that they set up a state-space model, but then only use the state to enforce a kind of weak continuity constraint on the decoding, rather than trying to capture the actual computations going on. But I should talk to them about that... Appendix A gives a very clear and compact explanation of particle filtering.]
- Olivier Cappé, "Online EM Algorithm for Hidden Markov Models", Journal of Computational and Graphical Statistics **20**(2011): 728--749, arxiv:0908.2359
- Pavel Chigansky and Ramon van Handel, "A complete solution to Blackwell's unique ergodicity problem for hidden Markov chains", Annals of Applied Probability **20**(2010): 2318--2345
- R. W. R. Darling, "Geometrically Intrinsic Nonlinear Recursive Filters," parts I and II, UCB technical reports 494 and 512
- P. Del Moral and L. Miclo, "Branching and Interacting Particle Systems Approximations of Feynman-Kac Formulae with Applications to Non-Linear Filtering", in J. Azema, M. Emery, M. Ledoux and M. Yor (eds.), Séminaire de Probabilités XXXIV (Springer-Verlag, 2000), pp. 1--145 [Postscript preprint. Looks like a trial run for Del Moral's book.]
- Randal Douc, Olivier Cappé and Eric Moulines, "Comparison of Resampling Schemes for Particle Filtering", cs.CE/0507025
- Arnaud Doucet, Nando De Freitas and Neil Gordon (eds.), Sequential Monte Carlo Methods in Practice
- Uri T. Eden, Loren M. Frank, Riccardo Barbieri, Victor Solo and Emery N. Brown, "Dynamic Analysis of Neural Encoding by Point Process Adaptive Filtering", Neural Computation **16**(2005): 971--988 [Interesting development of filtering methods for point processes, beyond the neural application]
- Robert J. Elliott, Lakhdar Aggoun and John B. Moore, Hidden Markov Models: Estimation and Control
- Gregory L. Eyink, "A Variational Formulation of Optimal Nonlinear Estimation," physics/0011049 [Nice connections between optimal estimation (assuming a known form for the underlying stochastic process), nonequilibrium statistical mechanics, and large deviations theory, leading to tractable-looking numerical schemes.]
- David C. Farrow, Maria Jahja, Roni Rosenfeld, Ryan J. Tibshirani, "Kalman Filter, Sensor Fusion, and Constrained Regression: Equivalences and Insights", arxiv:1905.11436 [This is a clever way of writing the Kalman filter as a regression of the current state on the current observables ("sensors") and the extrapolated state from the last time step. As they note, if you have different models for how the state evolves, you could then apply standard variable-selection techniques from regression, like the lasso, to pick one...]
- Edward Ionides, "Inference and Filtering for Partially Observed Diffusion Processes" [PDF preprint]
- Jayesh H. Kotecha and Petar M. Djuric, "Gaussian Particle Filtering", IEEE Transactions on Signal Processing **51**(2003): 2592--2601
- M. L. Kleptsyna, A. Le Breton and M.-C. Roubaud, "Parameter Estimation and Optimal Filtering for Fractional Type Stochastic Systems", Statistical Inference for Stochastic Processes **3**(2000): 173--182
- Hans R. Künsch, "Particle filters", Bernoulli **19**(2013): 1391--1403
- V. V. Prelov and E. C. van der Meulen, "On error-free filtering of finite-state singular processes under dependent distortions", Problems of Information Transmission **49**(2007): 271--279 ["We consider the problem of finding some sufficient conditions under which causal error-free filtering for a singular stationary stochastic process X = {X_{n}} with a finite number of states from noisy observations is possible. For a rather general model of observations where the observable stationary process is absolutely regular with respect to the estimated process X, it is proved (using an information-theoretic approach) that under a natural additional condition, causal error-free (with probability one) filtering is possible."]

- Recommended, historical analyses:
- Leonard A. McGee and Stanley F. Schmidt, "Discovery of the Kalman Filter as a Practical Tool for Aerospace and Industry", NASA Technical Memorandum 86847 (1985) [How we learned to aim for the stars and/or hit London. Free PDF.]
- Maxim Raginsky, "Introduction" to Kalman, "Contributions to the Theory of Optimal Control", forthcoming in David Krakauer (ed.), Foundational Papers in Complexity Science [Thanks to Prof. Raginsky for letting me read this in advance of publication]

- Recommended, historical interest:
- Norbert Wiener
  - Extrapolation, Interpolation and Smoothing of Stationary Time Series
  - Cybernetics

- Modesty forbids me to recommend:
- Shinsuke Koyama, Lucia Castellanos Pérez-Bolde, CRS and Robert E. Kass, "Approximate Methods for State-Space Models", Journal of the American Statistical Association **105**(2010): 170--180, arxiv:1004.3476

- To read:
- Lakhdar Aggoun and Robert Elliott, Measure Theory and Filtering: Introduction with Applications
- Francis Alexander, Gregory Eyink and Juan Restrepo, "Accelerated Monte-Carlo for Optimal Estimation of Time Series", Journal of Statistical Physics **119**(2005): 1331--1345 [PDF]
- Shun-ichi Amari, "Estimating Functions of Independent Component Analysis for Temporally Correlated Signals," Neural Computation **12**(2000): 2083--2107
- Alan Bain and Dan Crisan, Fundamentals of Stochastic Filtering
- T. Bohlin, "Information pattern for linear discrete-time models with stochastic coefficients," IEEE Transactions on Automatic Control **15**(1970): 104--106 [On recursively-computable sufficient statistics]
- D. Brigo, B. Hanzon and F. LeGland, "A differential geometric approach to nonlinear filtering: the projection filter," IEEE Transactions on Automatic Control **43**(1998): 247--252
- W. Bulatek, M. Lemanczyk and E. Lesigne, "On the Filtering Problem for Stationary Random \(\mathbb{Z}^2 \)-Fields", IEEE Transactions on Information Theory **51**(2005): 3586--3593
- Emmanuel Candes and Terence Tao, "Near Optimal Signal Recovery from Random Projections and Universal Encoding Strategies", math.CA/0410542
- Carlos M. Carvalho, Michael S. Johannes, Hedibert F. Lopes, and Nicholas G. Polson, "Particle Learning and Smoothing", Statistical Science **25**(2010): 88--106
- Hock Peng Chan and Tze Leung Lai, "A general theory of particle filters in hidden Markov models and some applications", Annals of Statistics **41**(2013): 2877--2904
- Pavel Chigansky
  - "On exponential stability of the nonlinear filter for slowly switching Markov chains", math.PR/0411596
  - "An ergodic theorem for filtering with applications to stability", math.PR/0404515

- Pavel Chigansky and Robert Liptser
  - "Stability of nonlinear filters in nonmixing case", math.PR/0304056 = Annals of Applied Probability **14**(2004): 2038--2056
  - "What is always stable in nonlinear filtering?", math.PR/0504094
- Alexandre J. Chorin and Paul Krause, "Dimensional reduction for a Bayesian filter", Proceedings of the National Academy of Sciences **101**(2004): 15013--15017 [If I understand their abstract correctly, they're basically saying that you only have to worry about uncertainties along the expanding directions of the dynamics --- uncertainty along the contracting directions is going to go away anyway! Probably it's not that simple...]
- Alexandre J. Chorin, Xuemin Tu, "Non-Bayesian particle filters", arxiv:0905.2181
- Noel Cressie, Tao Shi, and Emily L. Kang, "Fixed Rank Filtering for Spatio-Temporal Data", Journal of Computational and Graphical Statistics (2010) forthcoming
- Irene Crimaldi and Luca Pratelli, "Two inequalities for conditional expectations and convergence results for filters", Statistics and Probability Letters **74**(2005): 151--162
- Dan Crisan, Alberto Lopez-Yela, Joaquin Miguez, "Stable approximation schemes for optimal filters", arxiv:1809.00301
- Dan Crisan and Joaquin Miguez, "Particle-kernel estimation of the filter density in state-space models", Bernoulli **20**(2014): 1879--1929
- M. H. A. Davis and I. Marcus, "An Introduction to nonlinear filtering," pp. 53--75 in M. Hazewinkel and J. C. Willems (eds.), Stochastic Systems: The Mathematics of Filtering and Identification and Applications
- M. H. A. Davis and P. Varaiya, "Information states for linear stochastic systems," J. Math. Anal. Appl. **37**(1972): 384--402
- Pierre Del Moral
  - "Measure-Valued Processes and Interacting Particle Systems. Application to Nonlinear Filtering Problems", The Annals of Applied Probability **8**(1998): 438--495
  - Feynman-Kac Formulae: Genealogical and Interacting Particle Systems [This looks *really, really cool*]
- G. B. DiMasi and L. Stettner, "Ergodicity of hidden Markov models", Mathematics of Control, Signals, and Systems **17**(2005): 269--296 [Includes consideration of the ergodicity of filters for the HMM]
- C. T. J. Dodson and H. Wang, "Iterative Approximation of Statistical Distributions and Relation to Information Geometry", Statistical Inference for Stochastic Processes **4**(2001): 307--318 ["optimal control of stochastic processes through sensor estimation of probability density functions is given a geometric setting via information theory and the information metric."]
- F. Douarche, L. Buisson, S. Ciliberto and A. Petrosyan, "A Simple Denoising Technique", physics/0406055
- Randal Douc, Gersende Fort, Eric Moulines and Pierre Priouret, "Forgetting of the initial distribution for Hidden Markov Models", math.ST/0703836
- Randal Douc, Aurelien Garivier, Eric Moulines, Jimmy Olsson
  - "On the Forward Filtering Backward Smoothing particle approximations of the smoothing distribution in general state spaces models", arxiv:0904.0316
  - "Sequential Monte Carlo smoothing for general state space hidden Markov models", Annals of Applied Probability **21**(2011): 2109--2145, arxiv:1202.2945

- Randal Douc and Eric Moulines, "Limit theorems for weighted samples with applications to Sequential Monte Carlo Methods", math.ST/0507042 [With application to state-space filtering]
- Randal Douc, Eric Moulines, Jimmy Olsson, "Long-term stability of sequential Monte Carlo methods under verifiable conditions", Annals of Applied Probability **24**(2014): 1767--1802, arxiv:1203.6898
- Gregory L. Eyink and Juan M. Restrepo, "Most Probable Histories for Nonlinear Dynamics: Tracking Climate Transitions", Journal of Statistical Physics **101**(2000): 459--472 [PDF]
- Gregory L. Eyink, Juan M. Restrepo and Francis J. Alexander, "A Statistical-Mechanical Approach to Data Assimilation"
- Paul Fearnhead, Omiros Papaspiliopoulos, Gareth Roberts, "Particle Filters for Partially Observed Diffusions", arxiv:0710.4345
- R. M. Fernandez-Alcala, J. Navarro-Moreno, and J. C. Ruiz-Molina, "A Unified Approach to Linear Estimation Problems for Nonstationary Processes", IEEE Transactions on Information Theory **51**(2005): 3594--3601
- B. Fristedt, N. Jain and N. Krylov, Filtering and Prediction: A Primer
- Ramazan Gencay, Faruk Selcuk and Brandon Whitcher, An Introduction to Wavelets and Other Filtering Methods in Finance and Economics
- Arnaud Guillin, Randal Douc and Jamal Najim, "Moderate Deviations for Particle Filtering", math.PR/0401058 = Annals of Applied Probability **15**(2005): 587--614
- Dong Guo, Xiaodong Wang and Rong Chen, "New sequential Monte Carlo methods for nonlinear dynamic systems", Statistics and Computing **15**(2005): 135--147
- A. Hannachi, "Probabilistic-based Approach to Optimal Filtering", Physical Review E **61**(2000): 3610--3619
- M. Hazewinkel and S. I. Marcus, "On Lie algebras and finite-dimensional filtering", Stochastics **7**(1982): 29--62
- M. Hazewinkel and J. C. Willems (eds.), Stochastic Systems: The Mathematics of Filtering and Identification and Applications
- A. Inoue, Y. Nakano and V. Anh, "Linear filtering of systems with memory", math.PR/0407454
- Michael T. Johnson and Richard J. Povinelli, "Generalized phase space projection for nonlinear noise reduction", Physica D **201**(2005): 306--317
- Kevin Judd, "Failure of maximum likelihood methods for chaotic dynamical systems", Physical Review E **75**(2007): 036210 [He means failure for state estimation, not parameter estimation. I wonder if this isn't linked to the old Fox and Keizer papers about amplifying fluctuations in macroscopic chaos?]
- Kevin Judd and Leonard A. Smith
  - "Indistinguishable States I. Perfect Model Scenario", Physica D **151**(2001): 125--141
  - "Indistinguishable States II. The Imperfect Model Scenario", Physica D **196**(2004): 224--242
- Kevin Judd and Thomas Stemler, "Failures of sequential Bayesian filters and the successes of shadowing filters in tracking of nonlinear deterministic and stochastic systems", Physical Review E **79**(2009): 066206
- Kay, Fundamentals of Statistical Signal Processing [2 vols.]
- R. Khasminskii, "Nonlinear Filtering of Smooth Signals", Stochastics and Dynamics **5**(2005): 27--35
- Jin Won Kim, "Duality for nonlinear filtering", arxiv:2207.07709
- Sangil Kim, Greg Eyink, Frank Alexander, Juan Restrepo and Greg Johnson, "Ensemble Filtering for Nonlinear Dynamics", Monthly Weather Review **131**: 2586--2594 [PDF]
- Arthur J. Krener, "The Convergence of the Extended Kalman Filter," math.OC/0212255, also A. Rantzer and C. I. Byrnes (eds.), Directions in Mathematical Systems Theory and Optimization (Berlin: Springer-Verlag, 2002): 173--182
- H. J. Kushner
  - "On the differential equations satisfied by conditional probability densities of Markov processes, with applications," J. SIAM Control **A2**(1962): 106--119
  - "Approximation to Optimal Nonlinear Filters," IEEE Trans. Auto. Contr. **12**(1967): 546--556
  - Probability Methods for Approximations in Stochastic Control and for Elliptic Equations
- Sylvain Le Corff and Gersende Fort, "Online Expectation Maximization based algorithms for inference in hidden Markov models", arxiv:1108.3968
- Francois LeGland and Nadia Oudjane
  - "Stability and uniform approximation of nonlinear filters using the Hilbert metric and application to particle filters", Annals of Applied Probability **14**(2004): 144--187
  - "A robustification approach to stability and to uniform particle approximation of nonlinear filters: the example of pseudo-mixing signals", Stochastic Processes and Their Applications **106**(2003): 279--316
- John M. Lewis, S. Lakshmivarahan and Sudarshan Dhall, Dynamic Data Assimilation: A Least Squares Approach
- Nishanth Lingala, N. Sri Namachchivaya, Nicolas Perkowski, Hoong C. Yeong, "Particle filtering in high-dimensional chaotic systems", arxiv:1204.1360
- Robert S. Liptser and Albert N. Shiryaev, Statistics of Random Processes [2 vols., get 2nd edition]
- Xiaodong Luo, Jie Zhang and Michael Small, "Optimal phase space projection for noise reduction", nlin.CD/0506011
- Andrew J. Majda and Marcus J. Grote, "Explicit off-line criteria for stable accurate time filtering of strongly unstable spatially extended systems", Proceedings of the National Academy of Sciences (USA) **104**(2007): 1124--1129
- Andrew J. Majda and John Harlim, Filtering Complex Turbulent Systems
- W. P. Malcolm, R. J. Elliott and M. R. James, "Risk-Sensitive Filtering and Smoothing for Continuous-Time Markov Processes", IEEE Transactions on Information Theory **51**(2005): 1731--1738
- Inés P. Mariño, Joaquín Míguez, and Riccardo Meucci, "Monte Carlo method for adaptively estimating the unknown parameters and the dynamic state of chaotic systems", Physical Review E **79**(2009): 056218
- Isambi S. Mbalawata, Simo Särkkä, "Moment Conditions for Convergence of Particle Filters with Unbounded Importance Weights", arxiv:1403.6585
- Sanjoy K. Mitter and Nigel J. Newton, "Information and Entropy Flow in the Kalman-Bucy Filter", Journal of Statistical Physics **118**(2005): 145--176 [This looks rather strange, from the abstract, but potentially interesting...]
- Jun Morimoto and Kenji Doya, "Reinforcement Learning State Estimator", Neural Computation **19**(2007): 730--756
- Jose M. F. Moura and Sanjoy K. Mitter, "Identification and Filtering: Optimal Recursive Maximum Likelihood Approach" [1986 technical report from MIT; PDF preprint --- presumably long since published]
- D. Napoletani, C. A. Berenstein, T. Sauer, D. C. Struppa and D. Walnut, "Delay-Coordinates Embeddings as a Data Mining Tool for Denoising Speech Signals", physics/0504155
- V. Olshevsky and L. Sakhnovich, "Matched Filtering for Generalized Stationary Processes", IEEE Transactions on Information Theory **51**(2005): 3308--3313
- Jimmy Olsson, Olivier Cappe, Randal Douc and Eric Moulines, "Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models", Bernoulli **14**(2008): 155--179, math.ST/0609514
- Jimmy Olsson, Jonas Ströjby, "Particle-based likelihood inference in partially observed diffusion processes using generalised Poisson estimators", arxiv:1008.2886
- Edward Ott, Brian R. Hunt, Istvan Szunyogh, Matteo Corazza, Eugenia Kalnay, D. J. Patil, and James A. Yorke, "Exploiting Local Low Dimensionality of the Atmospheric Dynamics for Efficient Ensemble Kalman Filtering," physics/0203058
- E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D.J. Patil and J.A. Yorke, "Estimating the state of large spatio-temporally chaotic systems", Physics Letters A **330**(2004): 365--370
- Francesco Paparella, "Filling gaps in chaotic time series", Physics Letters A **346**(2005): 47--53
- Anastasia Papavasiliou, "Particle Filters for Multiscale Diffusions", arxiv:0710.5098
- Debasish Roy and G. Visweswara Rao, Stochastic Dynamics, Filtering and Optimization
- W. J. Runggaldier and F. Spizzichino, "Sufficient conditions for finite dimensionality of filters in discrete time: A Laplace transform-based approach," Bernoulli **7**(2001): 211--221
- Boris Ryabko, Daniil Ryabko, "Confidence Sets in Time--Series Filtering", arxiv:1012.3059
- Daniel Sanz-Alonso, Andrew M. Stuart, Armeen Taeb, "Inverse Problems and Data Assimilation", arxiv:1810.06191
- Simo Särkkä and Tommi Sottinen, "Application of Girsanov Theorem to Particle Filtering of Discretely Observed Continuous-Time Non-Linear Systems", arxiv:0705.1598
- G. Sawitzki, "Finite-dimensional filters in discrete time," Stochastics **5**(1981): 107--114
- Alexander Y. Shestopaloff, Radford M. Neal, "MCMC for non-linear state space models using ensembles of latent sequences", arxiv:1305.0320
- Steven T. Smith, "Covariance, Subspace, and Intrinsic Cramer-Rao Bounds", IEEE Transactions on Signal Processing **forthcoming** [Preprint kindly provided by Dr. Smith]
- Victor Solo and Xuan Kong, Adaptive Signal Processing Algorithms: Stability and Performance
- D. Sornette and K. Ide, "The Kalman-Levy filter," cond-mat/0004369
- R. L. Stratonovich
  - "Conditional Markov Processes," Theoretical Probability and Its Applications **5**(1960): 156--178
  - Conditional Markov Processes and Their Application to the Theory of Optimal Control
- Vladislav B. Tadic and Arnaud Doucet, "Exponential forgetting and geometric ergodicity for optimal filtering in general state-space models", Stochastic Processes and their Applications **115**(2005): 1408--1436
- Ronen Talmon and Ronald R. Coifman, "Empirical intrinsic geometry for nonlinear modeling and time series filtering", Proceedings of the National Academy of Sciences (USA) **110**(2013): 12535--12540
- Xin Thomson Tong, Ramon van Handel, "Ergodicity and stability of the conditional distributions of nondegenerate Markov chains", Annals of Applied Probability **22**(2012): 1495--1540, arxiv:1101.1822
- Fernando Tusell, "Kalman Filtering in R", Journal of Statistical Software **39:2**(2011)
- Ramon van Handel
  - "Observability and nonlinear filtering", Probability Theory and Related Fields **145**(2009): 35--74, arxiv:0708.3412
  - "Uniform Time Average Consistency of Monte Carlo Particle Filters", Stochastic Processes and their Applications **119**(2009): 3835--3861, arxiv:0812.0350
- T. Weissman, "How to Filter an 'Individual Sequence with Feedback'", IEEE Transactions on Information Theory **54**(2008): 3831--3841
- J. C. Willems, "Some remarks on the concept of information state," pp. 285--295 in O. L. R. Jacobs (ed.), Analysis and Optimization of Stochastic Systems
- Wan Yang, Jeffrey Shaman, "A simple modification for improving inference of non-linear dynamical systems", arxiv:1403.6804