Notebooks
http://bactra.org/notebooks
Cosma's Notebooks
Random Time Changes for Stochastic Processes
http://bactra.org/notebooks/2017/02/27#random-time-changes
<P>There is a class of results about transforming
one <a href="stochastic-processes.html">stochastic process</a> into another by
stretching and shrinking the time-scale, sometimes in a deterministic manner,
more often by means of a <em>random</em> change of time-scale which depends on
the realized trajectory of the process you started with.
<P>The easiest example might be from point processes (which deserve a notebook
of their own, some day). The simplest point process is the homogeneous Poisson
process with "unit intensity": there is a constant probability per unit time of
an event happening, which we can normalize to 1 (if necessary by changing our
time unit). Events are thus laid down completely independently of one another.
The consequence is that the distance between successive events is exponentially
distributed, with mean 1. So you can imagine someone creating a realization of
a homogeneous Poisson process with unit intensity by winding up a kitchen timer
for a random, exponentially-distributed amount of time, and then putting down a
point whenever it buzzes, at which point it is immediately reset to a new,
totally independent count.
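<P>The kitchen-timer picture translates almost directly into code. What follows is only a minimal sketch; the function name <code>simulate_unit_poisson</code>, the time horizon and the seed are my own illustrative choices, not anything canonical.

```python
import random

def simulate_unit_poisson(t_max, rng):
    """Event times of a unit-intensity homogeneous Poisson process on [0, t_max]."""
    times = []
    t = 0.0
    while True:
        # Wind up the timer: an independent exponential waiting time with mean 1.
        t += rng.expovariate(1.0)
        if t > t_max:
            return times
        times.append(t)

rng = random.Random(42)  # arbitrary seed, for reproducibility only
events = simulate_unit_poisson(10_000.0, rng)

# Gaps between successive events (the first gap is measured from time 0).
gaps = [b - a for a, b in zip([0.0] + events[:-1], events)]
mean_gap = sum(gaps) / len(gaps)
print(len(events), mean_gap)
```

With a horizon of 10,000 time units, the number of events should be close to 10,000 and the average gap close to 1, up to sampling noise.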
<P>Now suppose you have a homogeneous Poisson process which does not have unit
intensity, but one whose probability per unit time of an event
is \( \lambda \). To make a realization of this process look like a
realization of the standard Poisson process, simply take the time-axis and
rescale it by \( \lambda \) --- stretch out the separation between
points if the process has high intensity (so that points fall close together),
or compress them if the process has low intensity (so that points are widely
spaced). Symbolically, one defines a new time, \( \tau = t \lambda \),
and now \( X(\tau) \) looks like a realization of the standard Poisson
process, even though \( X(t) \) does not.
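<P>As a quick numerical check of the rescaling \( \tau = t \lambda \), here is a small sketch. The intensity value 5, the helper names <code>simulate_poisson</code> and <code>mean_gap</code>, and the seed are assumptions made purely for illustration.

```python
import random

def simulate_poisson(rate, t_max, rng):
    """Event times of a homogeneous Poisson process with the given rate."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # mean waiting time is 1 / rate
        if t > t_max:
            return times
        times.append(t)

def mean_gap(times):
    """Average spacing between successive event times."""
    return (times[-1] - times[0]) / (len(times) - 1)

rng = random.Random(0)  # arbitrary seed
lam = 5.0               # illustrative intensity
events = simulate_poisson(lam, 2_000.0, rng)

# The deterministic time change tau = lam * t.
rescaled = [lam * t for t in events]

print("mean gap, original time:", mean_gap(events))
print("mean gap, rescaled time:", mean_gap(rescaled))
```

The original gaps should average about \( 1/\lambda \), and the rescaled ones about 1.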
<P>If the Poisson process is <em>in</em>homogeneous, so that the probability
per unit time follows some fixed intensity function \( \lambda(t) \), then
the idea is similar, though the implementation is more complicated. We want
the new clock to advance quickly when the intensity is high, so that closely
packed events get stretched apart, and to advance slowly when the intensity is
low, so that widely spaced events get compressed together. It turns
out that the right definition is
\[
\tau(t) = \int_{0}^{t}{\lambda(s) ds}
\]
More specifically, if \( t_1, t_2, \ldots \) are the times of the events,
then the transformed times \( \tau(t_1), \tau(t_2), \ldots \) come from a
standard Poisson process, and \( \tau(t_{i+1}) - \tau(t_i) \) are IID and
exponentially distributed with mean 1. (This reduces to the previous
result when the intensity function is homogeneous.)
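<P>To see this numerically, one can simulate an inhomogeneous Poisson process and check the rescaled gaps. The sketch below assumes a made-up intensity \( \lambda(t) = 2 + 1.5 \sin t \), whose integral is available in closed form, and it generates the events by thinning (proposing candidates at a constant rate and keeping each with probability proportional to the intensity); that simulation method is my choice for the example, not part of the result above.

```python
import math
import random

LAMBDA_MAX = 3.5  # upper bound on the intensity, needed for thinning

def intensity(t):
    """Illustrative intensity function (an assumption of this sketch)."""
    return 2.0 + 1.5 * math.sin(t)

def cumulative_intensity(t):
    """tau(t): the integral of intensity(s) for s from 0 to t."""
    return 2.0 * t + 1.5 * (1.0 - math.cos(t))

def simulate_by_thinning(t_max, rng):
    """Propose candidates at rate LAMBDA_MAX; keep each with prob. intensity / LAMBDA_MAX."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(LAMBDA_MAX)
        if t > t_max:
            return times
        if rng.random() * LAMBDA_MAX < intensity(t):
            times.append(t)

def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(1)  # arbitrary seed
events = simulate_by_thinning(5_000.0, rng)
taus = [cumulative_intensity(t) for t in events]

raw_gaps = [b - a for a, b in zip([0.0] + events[:-1], events)]
tau_gaps = [b - a for a, b in zip([0.0] + taus[:-1], taus)]

print("raw gaps (mean, variance):     ", mean_and_variance(raw_gaps))
print("rescaled gaps (mean, variance):", mean_and_variance(tau_gaps))
```

An exponential distribution with mean 1 also has variance 1, so both numbers for the rescaled gaps should be near 1, while the raw gaps have mean near 1/2.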
<P>Now suppose that the intensity function is not fixed, but depends on some
random inputs, including possibly the history of the process. We write this
as \( \lambda(t|\mathcal{H}_t) \), where \( \mathcal{H}_t \) is
supposed to summarize everything that goes into setting the intensity. (Even
more technically, each \( \mathcal{H}_t \) is a sigma-field, and the
collection of them over various <em>t</em> forms a filtration; the intensity is
a random process adapted to this filtration. I could get even more technical
if you make me.) If one now defines the rescaled time
\[
\tau(t) = \int_{0}^{t}{\lambda(s|\mathcal{H}_s) ds}
\]
it is still the case that \( \tau(t_1), \tau(t_2), \ldots \) looks exactly
like a realization of a standard Poisson process. But notice that the way we
have re-scaled time is random, and possibly different from one realization of
the original point process to another, because the times at which events
happened can be included in the information represented
by \( \mathcal{H}_t \).
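<P>A toy example of a history-dependent intensity may help. In the sketch below, the intensity drops to zero right after each event and then recovers towards a ceiling, so \( \mathcal{H}_t \) only needs to record the time of the most recent event. The model, the constants <code>RATE</code> and <code>RECOVERY</code>, and all the function names are hypothetical, chosen so that the integral between events has a closed form; the simulation again uses thinning, which is valid here because the intensity never exceeds <code>RATE</code>.

```python
import math
import random

RATE = 4.0      # ceiling the intensity recovers towards (illustrative)
RECOVERY = 0.5  # recovery time constant after each event (illustrative)

def conditional_intensity(t, last_event):
    """Intensity at time t given that the most recent event was at last_event."""
    return RATE * (1.0 - math.exp(-(t - last_event) / RECOVERY))

def integrated_intensity(gap):
    """Integral of the conditional intensity across one inter-event gap."""
    return RATE * (gap - RECOVERY * (1.0 - math.exp(-gap / RECOVERY)))

def simulate(t_max, rng):
    """Thinning with constant bound RATE; time 0 is treated as if an event occurred."""
    times, t, last = [], 0.0, 0.0
    while True:
        t += rng.expovariate(RATE)
        if t > t_max:
            return times
        if rng.random() * RATE < conditional_intensity(t, last):
            times.append(t)
            last = t  # the history changes, and with it the future intensity

def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(3)  # arbitrary seed
events = simulate(5_000.0, rng)

raw_gaps = [b - a for a, b in zip([0.0] + events[:-1], events)]
# tau(t_i) - tau(t_{i-1}) depends on the realized gap, so the time change is random.
tau_gaps = [integrated_intensity(g) for g in raw_gaps]

print("raw gaps (mean, variance):     ", mean_and_variance(raw_gaps))
print("rescaled gaps (mean, variance):", mean_and_variance(tau_gaps))
```

The raw gaps here are more regular than exponential ones (their variance is well below the square of their mean), but the rescaled gaps should again have mean and variance near 1.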
<P>This idea is not limited to point processes; one can transform many other
sorts of stochastic processes to standardized versions by the appropriate
random time changes. (A trivial example: If \( W(t) \) is a standard
Wiener process, then \( X(t) = \sigma W(t) \) is a non-standard Wiener
process, where \( X(t) - X(s) \sim \mathcal{N}(0,\sigma^2 |t-s|) \).
But \( X(t/\sigma^2) \) is then a standard Wiener process again.)
Generally speaking, to know how to do the transformation requires that one know
something about the structure and parameters of the process.
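<P>The Wiener-process example can be checked by looking at increments. In this sketch, the value \( \sigma = 3 \), the helper names and the sample size are arbitrary choices: an increment of \( X(t/\sigma^2) \) over one unit of new time is an increment of \( X \) over \( 1/\sigma^2 \) units of old time.

```python
import math
import random

def x_increment(dt, sigma, rng):
    """Increment of X = sigma * W over an interval of length dt."""
    return sigma * rng.gauss(0.0, math.sqrt(dt))

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(7)  # arbitrary seed
sigma = 3.0             # illustrative scale
n = 20_000

# Increments of X(t) over unit steps of the original time.
raw = [x_increment(1.0, sigma, rng) for _ in range(n)]

# Increments of X(t / sigma^2) over unit steps of the new time.
changed = [x_increment(1.0 / sigma**2, sigma, rng) for _ in range(n)]

print("variance of X(t+1) - X(t):                    ", variance(raw))
print("variance of X((t+1)/sigma^2) - X(t/sigma^2):  ", variance(changed))
```

The first variance should be near \( \sigma^2 = 9 \), the second near 1.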
<P>This leads to what I can only call a brilliant little trick for doing
statistical inference on stochastic processes, first made explicit (so far as I
know) by Brown et al. If one is trying to fit a model to a point process, for
example, what's going on is that you have a set of possible guesses about the
conditional intensity function, say \( \phi(t|\mathcal{H}_t;\theta) \),
where \( \theta \) is a parameter indexing the different functions.
For one parameter value, call it \( \theta_0 \), this is actually
right:
\[
\phi(t|\mathcal{H}_t;\theta_0) = \lambda(t|\mathcal{H}_t)
\]
If your guess at \( \theta_0 \), call it \( \hat{\theta} \), is right, then you
can use \( \phi(t|\mathcal{H}_t;\hat{\theta}) \) to transform the original
point process into a realization of a standard Poisson process. And it is
really easy to test whether something <em>is</em> a realization of such a
process (are the inter-event times exponentially distributed with mean 1? are
they independent?). Notice that there are no free parameters in the hypothesis
one ends up testing. So you can estimate the parameter however you like, and
then do a back-end test on the goodness of fit.
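<P>Here is a rough sketch of such a back-end test, under assumptions I should flag. The data come from a made-up intensity \( 2 + 1.5 \sin t \); the "right" model rescales with that intensity, and the "wrong" model is a homogeneous Poisson process whose rate is estimated as events per unit time. Rescaled gaps are mapped through \( z = 1 - e^{-\Delta\tau} \), which should be uniform on [0,1] if the model is right, and compared to the uniform distribution with a hand-rolled Kolmogorov-Smirnov statistic. The value \( 1.36/\sqrt{n} \) is the usual large-sample approximation to the 95% critical value. Only the distributional half of the test is covered; independence of the gaps is not checked here.

```python
import math
import random

def true_cumulative(t):
    """Integral from 0 to t of the illustrative intensity 2 + 1.5 sin(s)."""
    return 2.0 * t + 1.5 * (1.0 - math.cos(t))

def simulate_by_thinning(t_max, rng, lam_max=3.5):
    """Simulate events with intensity 2 + 1.5 sin(t), bounded above by lam_max."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam_max)
        if t > t_max:
            return times
        if rng.random() * lam_max < 2.0 + 1.5 * math.sin(t):
            times.append(t)

def rescaled_gaps(events, cumulative):
    """Differences of tau(t_i) under a candidate cumulative intensity."""
    taus = [cumulative(t) for t in events]
    return [b - a for a, b in zip([0.0] + taus[:-1], taus)]

def ks_statistic(gaps):
    """KS distance between z = 1 - exp(-gap) and the Uniform(0, 1) distribution."""
    z = sorted(1.0 - math.exp(-g) for g in gaps)
    n = len(z)
    return max(max((i + 1) / n - zi, zi - i / n) for i, zi in enumerate(z))

rng = random.Random(2002)  # arbitrary seed
T = 5_000.0
events = simulate_by_thinning(T, rng)
n = len(events)

rate_hat = n / T  # fitted rate for the (wrong) homogeneous model
ks_right = ks_statistic(rescaled_gaps(events, true_cumulative))
ks_wrong = ks_statistic(rescaled_gaps(events, lambda t: rate_hat * t))
threshold = 1.36 / math.sqrt(n)  # approximate 95% critical value

print("KS, correct model:    ", ks_right)
print("KS, homogeneous model:", ks_wrong)
print("95% threshold:        ", threshold)
```

The homogeneous model should be rejected decisively, even though its one parameter was fit to the data, while the correct model should usually fall under the threshold.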
<P>I would really like to know more about how this can be done for other
processes, and its strengths and limitations as e.g. a means
of <a href="model-selection.html">model selection</a>.
<ul>Recommended:
<li>Emery N. Brown, Riccardo Barbieri, Valérie Ventura, Robert
E. Kass and Loren M. Frank, "The Time-Rescaling Theorem and Its Applications to
Neural Spike Train Data Analysis", <cite>Neural
Computation</cite> <strong>14</strong> (2002): 325--346
[<a href="http://www.stat.cmu.edu/~vventura/rescaling.pdf">PDF reprint</a>]
<li>Felipe Gerhard, Robert Haslinger, and Gordon Pipa, "Applying the Multivariate Time-Rescaling Theorem to Neural Population Models", <a href="http://dx.doi.org/"><cite>Neural Computation</cite> <strong>23</strong> (2011): 1452--1483</a>
<li>Robert Haslinger, Gordon Pipa and Emery Brown,
"Discrete Time Rescaling Theorem: Determining Goodness of Fit for
Discrete Time Statistical Models of Neural Spiking",
<a href="http://dx.doi.org/10.1162/NECO_a_00015"><cite>Neural Computation</cite> <strong>22</strong> (2010): 2477--2506</a>
<li>Olav Kallenberg, <cite>Foundations of Modern Probability</cite>
</ul>
<ul>To read:
<li>Ole E. Barndorff-Nielsen and Albert Shiryaev, <cite><a href="http://www.worldscientific.com/worldscibooks/10.1142/9609">Change of Time and Change of Measure</a></cite>
<li>Uwe Küchler and Michael Sørensen, <cite>Exponential Families
of Stochastic Processes</cite>
</ul>
<hr>
<em>Previous versions</em>: 2007-11-27 20:11