Hidden Markov Models and Dynamical Systems

<title>Review of Andrew M. Fraser, Hidden Markov Models and Dynamical Systems</title>

<cite><a href="../">The Bactra Review: Occasional and eclectic book reviews by
Cosma Shalizi</a></cite> &nbsp; 139

<h1>Hidden Markov Models and Dynamical Systems</h1>
<h2><em>by</em> <a href="../authors.html#andrew-fraser">Andrew M. Fraser</a></h2>

Philadelphia: Society for Industrial and Applied Mathematics Press, 2008

<hr>

<h4>Statistics of Moving Shadows</h4>

<P>In a dynamical system, such as one finds in classical mechanics, the current
state of the system uniquely determines the whole future trajectory.  (This
basically <em>defines</em> the concept of "state".)  Markov models are the
natural generalization of dynamical systems to situations where strict
determinism does not apply; rather than the current state fixing the whole
future trajectory, it fixes the distribution over future trajectories.  Rather
than having a perfect, idealized machine, one has a machine with some internal
noise and fluctuations.

<P>Hidden Markov models are one natural way of generalizing Markov models to
situations where direct measurement of the state is not possible: the current
state is enough to fix the distribution of observations, but observations have
no real effect on states.  (There are other ways to handle imperfect
measurement.)  Trying to understand a hidden Markov model from its observed
time series is like trying to figure out the workings of a noisy machine from
looking at the shadows its moving parts cast on a wall, with the proviso that
the shadows are cast by a randomly-flickering candle.

<P>Reverse-engineering by candle-light is, of course, much closer to our usual
situation in dealing with nature than is ideal mechanics, and HMMs are an
extremely useful class of model for real data. Unfortunately, a lot of the
presentations of the theory for HMMs tend to be pretty arcane.  This book
provides a readable and practical introduction to HMMs and their basic
algorithms suitable for anyone with a basic grasp of linear algebra, dynamics,
and probability.  I would expect that final-year undergraduates in physics,
engineering, math, etc., should be able to profit from it.  Statistics and
machine learning students may in fact have a harder time with it, because of
their lack of familiarity with dynamics and differential equations.  (I should
be able to test that guess by the end of the spring semester.)

<P>Chapter 1 introduces HMMs as models of dynamics, including running examples
based on the Lorenz equation and an actual laser system which closely
approximates it.  Chapter 2 derives the basic algorithms for discrete-state,
discrete-observation, discrete-time HMMs: the forward algorithm for calculating
the probability of an observation sequence; the backward algorithm for
calculating the probability of a state sequence; the Viterbi algorithm for
finding the most likely state sequence; and the
Baum-Welch/forward-backward/expectation-maximization algorithm for
maximum-likleihood parameter estimation, including a nicely physical discussion
of why EM converges.  Chapter 3 looks at some extensions to the basic model and
the corresponding changes in algorithms, while chapter 4 goes to
continuous-state, continuous-observation HMMs, a.k.a. "state-space models", and
derives the Kalman filter for linear, Gaussian HMMs.  It also touches on the
more complicated methods needed for fully nonlinear, non-Gaussian
processes.

<P>Professionals will not of course learn much from the first four chapters,
but should still look at chapter 5, which contains natural, but so far as I
know novel, observations linking bounds on the likelihood to the Lyapunov
exponents of dynamical systems.

<P>Chapter 6 is an example of using HMMs to do classification on real
physiological time series ("is the patient breathing or not?"), which brings
out the need to modify the basic algorithms for this sort of task.  I suspect
this will be pedagogically very useful, since students are much too apt to
think of standard algorithms as terminals rather than starting-points, and
it'll do them good to see how to re-work them.

<P>The one draw-back from using this book in the classroom is that instructors
would have to make up their own problem sets.  However, all
the <a href="http://www.siam.org/books/ot107">code</a> is available (in
Python), so the lazy teacher could just assign students to reproduce the
examples in the text.

<P><em>Disclaimer</em>: Andy is an acquaintance, and he was kind enough to let
me read an earlier draft a few years ago, and to send me a copy now.  However,
I have no stake in the success of this book.

<hr>

<P>xii+132 pp., bibliography, many line and color figures, index

<P>
	<a href="../subjects/probability.html">Probability and Statistics</a>

<P>In print as a paperback, ISBN  978-0-898716-65-8,
US$55 [<a href="http://www.powells.com/partner/27627/biblio/9780898716658">Buy from Powell's</a>]

<hr>

17 December 2008