Equations of Motion from a Time Series

21 Jul 2022 09:16

This is, of course, a phrase I've borrowed from my thesis adviser (who was in turn referencing what's probably still his best-known and most influential paper). Let me try to explain.

Suppose I have a physical system whose state is represented by some (generally multi-dimensional) variable \( S \). The state changes over time; without loss of generality, at the initial time \( t=0 \) we have the state \( S(0) \). By "equations of motion" physicists generally mean equations which describe how the state changes over time, and (this is important) how it _would_ change if the initial condition were different. That is, it's not enough to describe one curve or trajectory; the equations of motion have to give all possible trajectories, at least within some non-trivial domain of application.

Assume for simplicity that the system is time-homogeneous. (Time-inhomogeneous stuff is basically similar but needs more notation.) Then the general way we'd write the equations of motion is to introduce a family of operators \( \Phi_t \) where \[ S(t) = \Phi_t S(0) \] and where, for self-consistency, the operators form a semi-group, \[ S(t+h) = \Phi_h \circ \Phi_t S(0) = \Phi_h S(t) = \Phi_t S(h) \]

Even more often, the equations of motion are differential equations, \[ \frac{dS}{dt} = L(S(t)) \] for some differential operator \( L \). I realize I'm being ridiculously abstract, so let's boil this down to something nice like the simple harmonic oscillator. There \( S \) is a two-dimensional vector, with the two dimensions being position and momentum, say \( S_1 \) and \( S_2 \) respectively, and, for mass \( m \) and spring constant \( k \), the equations of motion are \[ \left(\begin{array}{c} \frac{dS_1}{dt} \\ \frac{dS_2}{dt} \end{array}\right) = \left( \begin{array}{c} \frac{S_2}{m} \\ -k S_1 \end{array}\right) \] (In this case the differential operator I called \( L \) is linear, but in general that need not be true.) You can convert this to the evolution-operator form by solving the differential equations.
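To make the "solve the differential equations" step concrete, here is a minimal sketch for the harmonic oscillator, where the solution is known in closed form and \( \Phi_t \) is just a 2×2 matrix. The function name `flow_map` and the particular values of \( m \), \( k \), \( t \) and \( h \) are arbitrary choices for illustration. The code checks the semi-group property numerically, and checks that the small-time limit of \( (\Phi_\epsilon - I)/\epsilon \) recovers the right-hand side of the differential equation.

```python
import numpy as np

def flow_map(t, m=1.0, k=1.0):
    """Evolution operator Phi_t for dS1/dt = S2/m, dS2/dt = -k*S1, as a 2x2 matrix."""
    omega = np.sqrt(k / m)  # angular frequency of the oscillator
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.array([[c, s / (m * omega)],
                     [-m * omega * s, c]])

# Arbitrary illustrative parameter values (assumptions, not from the text).
m, k = 2.0, 3.0
t, h = 0.7, 0.3

# Semi-group property: Phi_{t+h} = Phi_h o Phi_t.
semigroup_ok = np.allclose(flow_map(t + h, m, k),
                           flow_map(h, m, k) @ flow_map(t, m, k))

# Generator: (Phi_eps - I)/eps should approach the matrix of the linear operator L.
A = np.array([[0.0, 1.0 / m],
              [-k, 0.0]])
eps = 1e-6
generator_ok = np.allclose((flow_map(eps, m, k) - np.eye(2)) / eps, A, atol=1e-5)

print("semi-group property holds:", semigroup_ok)
print("small-time limit recovers L:", generator_ok)
```

For a nonlinear \( L \) there is usually no closed form, and \( \Phi_t \) would have to come from a numerical integrator instead.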

So far so good on "equations of motion". (That's a lie, I seriously doubt anyone who didn't know what "equations of motion" meant followed any of that, but it at least fixed notation, and I might perhaps replace it with something genuinely explanatory later.) Now about the "from a time series" part. Suppose we have observations of one state trajectory at some finite set of times, say \( s(t_1), s(t_2), \ldots, s(t_n) \). Let's say we know, or are willing to believe, that the state evolves according to some differential equation, described by the differential operator \( L \). The goal then is to learn, induce or infer the differential operator \( L \), and thus the equations of motion. Or, if we can't do that, maybe we can learn the semi-group of \( \Phi_t \) for selected values of \( t \), perhaps just the powers of \( \Phi_h \) for some lag \( h \).

Obviously this problem is not solvable in general. There are multiple reasons for this. Here's a basic one: solutions of differential equations are (generally) continuous functions of time, but we've observed the trajectory at a finite number of time points, so we don't even know what the solution was. Plus, of course, the same function of time can be the solution to multiple differential equations! (This is where the bit about wanting to know what would happen if the initial conditions had been different becomes important.) So the whole problem is ill-posed and under-determined in many, many ways. One can nonetheless ask about a procedure which will be "consistent" in the (confusing) sense in which statisticians use the word, i.e., which will deliver a sequence of approximate answers which will get closer and closer to the truth as we get more and more data. "More and more data" here might mean: more and more trajectories at the same times \( t_1, \ldots t_n \) but starting from different initial conditions; longer and longer trajectories starting from the same initial condition; or "in-fill asymptotics", where we keep the initial and final times fixed but increase \( n \), so the time elapsed between state measurements shrinks.
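The point that one function of time can solve several differential equations is easy to see in a toy case (the two equations here are chosen purely for illustration). Take a one-dimensional state and compare \[ \frac{dS}{dt} = -S \quad \text{and} \quad \frac{dS}{dt} = -|S| \] Starting from \( S(0) = 1 \), both have the solution \( S(t) = e^{-t} \), which stays positive forever, so no amount of data from that one trajectory can tell the two equations apart. Starting from \( S(0) = -1 \), however, the first gives \( S(t) = -e^{-t} \), which decays to zero, while the second reduces to \( dS/dt = S \) on the negative half-line and gives \( S(t) = -e^{t} \), which runs off to \( -\infty \). The two equations of motion agree on the region of state space the first trajectory visited and disagree everywhere else.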

Here is a very crude approach. Suppose that the intervals \( t_i - t_{i-1} \) are all equal to \( h \), and that \( h \) is small, so we can say \[ \frac{dS}{dt} \approx \frac{S(t+h) - S(t)}{h} \] Now say \( Y_{i} = \frac{S(t_{i+1}) - S(t_{i})}{h} \) and \( X_i = S(t_i) \) for \( i \in 1:(n-1) \), and run your favorite nonparametric regression of \( Y \) on \( X \). Under some very broad conditions (starting with the process being ergodic and mixing), this will actually give a consistent estimate of \( \Phi_h \) (strictly, of \( (\Phi_h - I)/h \), with \( I \) the identity, from which \( \Phi_h \) follows immediately). Getting a consistent estimate of \( L \) is a little trickier but also certainly possible in an in-fill regime.
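A minimal sketch of the crude approach, under loudly stated assumptions: the data are a noise-free harmonic-oscillator trajectory with \( m = k = 1 \) started at \( (1, 0) \), and "your favorite nonparametric regression" is taken to be a Nadaraya-Watson kernel smoother with a Gaussian kernel and a hand-picked bandwidth. The names `kernel_regression`, `L_hat` and `Phi_h_hat` are made up for this example, and nothing here is tuned.

```python
import numpy as np

# Assumed toy data: harmonic oscillator with m = k = 1 started at (1, 0),
# whose exact trajectory is S(t) = (cos t, -sin t).
h, n = 0.01, 2000
times = h * np.arange(n)
S = np.column_stack([np.cos(times), -np.sin(times)])

X = S[:-1]                 # X_i = S(t_i)
Y = (S[1:] - S[:-1]) / h   # Y_i = (S(t_{i+1}) - S(t_i)) / h

def kernel_regression(x_query, X, Y, bandwidth=0.05):
    """Nadaraya-Watson estimate of E[Y | X = x_query] with a Gaussian kernel."""
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    return weights @ Y / weights.sum()

x_query = np.array([0.0, -1.0])           # a state the trajectory actually visits
L_hat = kernel_regression(x_query, X, Y)  # finite-difference estimate of dS/dt there
Phi_h_hat = x_query + h * L_hat           # implied one-step-ahead map at x_query

L_true = np.array([x_query[1], -x_query[0]])  # (S_2/m, -k*S_1) with m = k = 1
print("estimated dS/dt:", L_hat, "true:", L_true)
print("estimated Phi_h(x):", Phi_h_hat)
```

The query point sits on the observed orbit on purpose. A single trajectory of this system only ever visits one ellipse in state space, so the smoother has nothing to say about states far from it (with this kernel the weights underflow to zero and the ratio is undefined), which is the "different initial conditions" problem in miniature.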

Now, nonparametric regression methods can be notoriously hard to interpret (cough neural networks cough), whereas we might want clean equations of motion. One would hope that this could be achieved through regularization and penalties, but that's a question for investigation.
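Here is one way that hope might be cashed out, offered as a sketch and not as an answer to the open question. It swaps the nonparametric smoother for a least-squares fit over a small library of candidate terms, then applies a crude sparsity penalty: coefficients below a threshold are repeatedly set to zero and the rest refit (sequentially thresholded least squares). Everything specific is an assumption made for the example: the monomial library up to degree 2, the threshold of 0.05, the five random initial conditions, and the noise-free harmonic-oscillator data with \( m = k = 1 \).

```python
import numpy as np

rng = np.random.default_rng(0)
h, n = 0.01, 1000
times = h * np.arange(n)

# Several trajectories from different (assumed, random) initial conditions.
X_parts, Y_parts = [], []
for x0, p0 in rng.uniform(-2.0, 2.0, size=(5, 2)):
    # Exact harmonic-oscillator trajectory for m = k = 1.
    S = np.column_stack([x0 * np.cos(times) + p0 * np.sin(times),
                         -x0 * np.sin(times) + p0 * np.cos(times)])
    X_parts.append(S[:-1])                # X_i = S(t_i)
    Y_parts.append((S[1:] - S[:-1]) / h)  # Y_i = finite-difference derivative
X, Y = np.vstack(X_parts), np.vstack(Y_parts)

names = ["1", "S1", "S2", "S1^2", "S1*S2", "S2^2"]

def library(X):
    """Candidate right-hand-side terms: monomials in (S1, S2) up to degree 2."""
    s1, s2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), s1, s2, s1**2, s1 * s2, s2**2])

def thresholded_least_squares(Theta, y, threshold=0.05, n_iter=10):
    """Least squares, then repeatedly zero small coefficients and refit the rest."""
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(coef) >= threshold
        coef = np.zeros_like(coef)
        if keep.any():
            coef[keep] = np.linalg.lstsq(Theta[:, keep], y, rcond=None)[0]
    return coef

Theta = library(X)
coefs = np.column_stack([thresholded_least_squares(Theta, Y[:, j]) for j in range(2)])

for j, lhs in enumerate(["dS1/dt", "dS2/dt"]):
    terms = [f"{c:+.3f}*{name}" for c, name in zip(coefs[:, j], names) if c != 0.0]
    print(lhs, "=", " ".join(terms))
```

The threshold does real work even with noise-free data: the finite-difference targets carry a spurious term of order \( h/2 \) (a small coefficient on \( S_1 \) in the first equation and on \( S_2 \) in the second), and it is the penalty that removes it. Whether anything like this behaves well with noise, partial observation, or a library that doesn't contain the truth is exactly the part that needs investigation.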

Obvious complications are obvious: what if we don't observe \( S(t) \) but \( X(t) = f(S(t)) \)? What if we observe through noise? What if there is a stochastic component to the dynamics?

See also: State-Space Reconstruction; Inference for Stochastic Differential Equations; Prediction Processes; Markovian (and Conceivably Causal) Representations of Stochastic Processes; Koopman Operators for Modeling Dynamical Systems and Time Series