## Equations of Motion from a Time Series

*11 Apr 2022 17:37*

This is, of course, a phrase I've borrowed from my thesis adviser (who was in turn referencing what's probably still his best-known and most influential paper). Let me try to explain.

Suppose I have a physical system whose state is represented by some (generally multi-dimensional) variable \( S \). The state changes over time; without loss of generality, at the initial time \( t=0 \) we have the state \( S(0) \). By "equations of motion" physicists generally mean equations which describe how the state changes over time, and (this is important) how it _would_ change if the initial condition were different. That is, it's not enough to describe one curve or trajectory; the equations of motion have to give all possible trajectories, at least within some non-trivial domain of application.

Assume for simplicity that the system is time-homogeneous. (Time-inhomogeneous stuff is basically similar but needs more notation.) Then the general way we'd write the equations of motion is to introduce a family of operators \( \Phi_t \) where \[ S(t) = \Phi_t S(0) \] and where, for self-consistency, the operators form a semi-group, \[ S(t+h) = \Phi_h \circ \Phi_t S(0) = \Phi_h S(t) = \Phi_t S(h) \]

Even more often, the equations of motion are differential equations, \[ \frac{dS}{dt} = L(S(t)) \] for some differential operator \( L \). I realize I'm being ridiculously abstract, so let's boil this down to something nice like the simple harmonic oscillator. Here \( S \) is a two-dimensional vector, with the two dimensions being position and momentum, say \( S_1 \) and \( S_2 \) respectively, and the equations of motion are \[ \left(\begin{array}{c} \frac{dS_1}{dt} \\ \frac{dS_2}{dt} \end{array}\right) = \left( \begin{array}{c} \frac{S_2}{m} \\ -k S_1 \end{array}\right) \] (In this case the differential operator I called \( L \) is linear, but in general it need not be.) You can convert this to the evolution-operator form by solving the differential equations.
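To make "solving the differential equations" concrete, here is a minimal sketch for the oscillator above. The name `flow_map` and the particular values of `m`, `k` and the initial state are illustrative assumptions; the matrix is just the standard closed-form solution, written out as \( \Phi_t \).

```python
import numpy as np

def flow_map(t, m=1.0, k=1.0):
    """Evolution operator Phi_t for dS1/dt = S2/m, dS2/dt = -k*S1, as a 2x2 matrix."""
    omega = np.sqrt(k / m)  # angular frequency of the oscillator
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.array([[c, s / (m * omega)],
                     [-m * omega * s, c]])

m, k = 2.0, 3.0             # illustrative values (assumed)
S0 = np.array([1.0, 0.5])   # illustrative initial state (assumed)
t, h = 0.7, 0.3

# Semi-group property: Phi_{t+h} S(0) should equal Phi_h applied to Phi_t S(0)
direct = flow_map(t + h, m, k) @ S0
composed = flow_map(h, m, k) @ (flow_map(t, m, k) @ S0)
semigroup_ok = np.allclose(direct, composed)

# For small h, (Phi_h - I)/h should be close to the matrix representing L
L_matrix = np.array([[0.0, 1.0 / m], [-k, 0.0]])
small_h = 1e-6
generator_estimate = (flow_map(small_h, m, k) - np.eye(2)) / small_h
generator_ok = np.allclose(generator_estimate, L_matrix, atol=1e-5)
```

Both checks only work this neatly because the system is linear, so \( \Phi_t \) is a matrix; for a nonlinear \( L \) one would generally need a numerical integrator instead.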

So far so good on "equations of motion". (That's a lie, I seriously doubt anyone who didn't know what "equations of motion" meant followed any of that, but it at least fixed notation, and I might perhaps replace it with something genuinely explanatory later.) Now about the "from a time series" part. Suppose we have observations of one state trajectory at some finite set of times, say \( s(t_1), s(t_2), \ldots, s(t_n) \). Let's say we know, or are willing to believe, that the state evolves according to some differential equation, described by the differential operator \( L \). The goal then is to learn, induce or infer the differential operator \( L \), and thus the equations of motion. Or, if we can't do that, maybe we can learn the semi-group of \( \Phi_t \) for selected values of \( t \), perhaps just the powers of \( \Phi_h \) for some lag \( h \).

Obviously this problem is not solvable in general. There are multiple
reasons for this. Here's a basic one: solutions of differential equations are
(generally) continuous functions of time, but we've observed the trajectory at
a *finite* number of time points, so we don't even know what the
solution was. Plus, of course, the same function of time can be the solution
to multiple differential equations! (This is where the bit about wanting to
know what would happen if the initial conditions had been different becomes
important.) So the whole problem is ill-posed and under-determined in many,
many ways. One can nonetheless ask about a procedure which will be
"consistent" in the (confusing) sense in which statisticians use the word,
i.e., which will deliver a sequence of approximate answers which will get
closer and closer to the truth as we get more and more data. "More and more
data" here might mean: more and more trajectories at the same times \( t_1,
\ldots t_n \) but starting from different initial conditions; longer and longer
trajectories starting from the same initial condition; or "in-fill
asymptotics", where we keep the initial and final times fixed but increase \( n
\), so the time elapsed between state measurements shrinks.

Here is a *very* crude approach. Suppose that the intervals \( t_i - t_{i-1} \) are all equal to \( h \), and that \( h \) is small, so we can say
\[
\frac{dS}{dt} \approx \frac{S(t+h) - S(t)}{h}
\]
Now say \( Y_{i} = \frac{S(t_{i+1}) - S(t_{i})}{h} \) and \( X_i = S(t_i) \) for \( i \in 1:(n-1) \), and run your favorite nonparametric regression of \( Y \) on \( X \). Under some very broad conditions (starting with the process being ergodic and mixing), this will actually give a consistent estimate of \( (\Phi_h - I)/h \), where \( I \) is the identity, and hence of \( \Phi_h \). Getting a consistent estimate of \( L \) is a little trickier but also certainly possible in an in-fill regime.
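The crude approach can be sketched in a few lines. This is a toy illustration under loud assumptions: the data come from an exactly-solved harmonic oscillator with \( m = k = 1 \), the "favorite nonparametric regression" is taken to be a Nadaraya-Watson kernel smoother with a hand-picked bandwidth, and the helper names `simulate` and `kernel_regression` are made up for the example.

```python
import numpy as np

def simulate(S0, h, n):
    """Exact oscillator states (m = k = 1, assumed) at times 0, h, ..., (n-1)*h."""
    t = h * np.arange(n)
    c, s = np.cos(t), np.sin(t)
    return np.column_stack([S0[0] * c + S0[1] * s,
                            -S0[0] * s + S0[1] * c])

def kernel_regression(X, Y, x_query, bandwidth):
    """Nadaraya-Watson estimate of E[Y | X = x_query] with a Gaussian kernel."""
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    return weights @ Y / weights.sum()

h, n = 0.01, 2000                       # step and sample size (assumed)
S = simulate(np.array([1.0, 0.0]), h, n)
X = S[:-1]                              # X_i = S(t_i)
Y = (S[1:] - S[:-1]) / h                # Y_i = (S(t_{i+1}) - S(t_i)) / h

x_query = S[500]                        # a state the trajectory actually visits
estimate = kernel_regression(X, Y, x_query, bandwidth=0.1)
truth = np.array([x_query[1], -x_query[0]])   # L(x) = (x2/m, -k*x1) with m = k = 1
```

The regression target is the finite-difference quotient, so `estimate` approximates \( L \) at `x_query` only up to an error of order \( h \) plus the smoothing bias. Note also that a single trajectory here covers one closed orbit, so querying away from that orbit is extrapolation; this is the "what would happen from other initial conditions" problem in miniature.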

Now, nonparametric regression methods can be notoriously hard to interpret
(cough neural networks cough), whereas we might want *clean* equations of
motion. One would hope that this could be achieved through regularization
and penalties, but that's a question for investigation.
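As one hedged illustration of what "regularization and penalties" might look like, here is a sketch that regresses the finite-difference derivatives on a small library of polynomial terms and repeatedly zeroes out small coefficients (thresholded least squares). The library, the threshold, the choice of three initial conditions and all function names are assumptions made for the example, and nothing here shows that such a scheme works in general.

```python
import numpy as np

def simulate(S0, h, n):
    """Exact oscillator states (m = k = 1, assumed) at times 0, h, ..., (n-1)*h."""
    t = h * np.arange(n)
    c, s = np.cos(t), np.sin(t)
    return np.column_stack([S0[0] * c + S0[1] * s,
                            -S0[0] * s + S0[1] * c])

def library(X):
    """Candidate terms, in order: 1, x1, x2, x1^2, x1*x2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

def thresholded_least_squares(Theta, Y, threshold, n_iter=10):
    """Least squares, repeatedly zeroing small coefficients and refitting the rest."""
    coef = np.linalg.lstsq(Theta, Y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        for j in range(Y.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                coef[keep, j] = np.linalg.lstsq(Theta[:, keep], Y[:, j], rcond=None)[0]
    return coef

h, n = 0.01, 1000
# Several initial conditions (assumed): on a single orbit x1^2 + x2^2 is constant,
# which would make the library columns exactly collinear.
starts = [np.array([1.0, 0.0]), np.array([0.5, 0.5]), np.array([2.0, -1.0])]
runs = [simulate(S0, h, n) for S0 in starts]
X = np.vstack([S[:-1] for S in runs])
Y = np.vstack([(S[1:] - S[:-1]) / h for S in runs])

coef = thresholded_least_squares(library(X), Y, threshold=0.1)
# Column 0 holds the coefficients for dS1/dt, column 1 those for dS2/dt.
```

In this noiseless toy case the surviving terms are \( x_2 \) in the first equation and \( x_1 \) in the second, with coefficients near \( +1 \) and \( -1 \). Whether anything similar holds with noise, partial observation, or a badly chosen library is exactly the open question.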

Obvious complications are obvious: what if we don't observe \( S(t) \) but \( X(t) = f(S(t)) \)? What if we observe through noise? What if there is a stochastic component to the dynamics?

See also: State-Space Reconstruction; Inference for Stochastic Differential Equations; Prediction Processes; Markovian (and Conceivably Causal) Representations of Stochastic Processes

- Recommended, big picture:
- James P. Crutchfield and Bruce S. McNamara, "Equations of Motion from a Data Series", Complex Systems **1**(1987): 417--452 [PDF reprint via Prof. Crutchfield]

- Recommended, close-ups:
- Xiaowu Dai, Lexin Li, "Kernel Ordinary Differential Equations", arxiv:2008.02915

- To read [A lot of these abstracts provoke my skepticism, "to be shot after a fair trial", as my mother used to say]:
- Kathleen Champion, Bethany Lusch, J. Nathan Kutz, Steven L. Brunton, "Data-driven discovery of coordinates and governing equations", arxiv:1904.02107 [Pretty sure that the "first method of its kind" stuff runs smack into Crutchfield & McNamara...]
- Toby S. Cubitt, Jens Eisert, Michael M. Wolf, "Extracting dynamical equations from experimental data is NP-hard", Physical Review Letters **108**(2012): 120503, arxiv:1005.0005
- Jinchao Feng, Yunxiang Ren, Sui Tang, "Data-driven discovery of interacting particle systems using Gaussian processes", arxiv:2106.02735
- Ziming Liu and Max Tegmark, "Machine Learning Conservation Laws from Trajectories", Physical Review Letters **126**(2021): 180604
- Peter Y. Lu, Samuel Kim, Marin Soljacic, "Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning", Physical Review X **10**(2020): 031056, arxiv:1907.06011
- Suryanarayana Maddu, Bevan L. Cheeseman, Christian L. Müller, and Ivo F. Sbalzarini, "Learning physically consistent differential equation models from data using group sparsity", Physical Review E **104**(2021): 042310
- Agustín Somacal, Yamila Barrera, Leonardo Boechi, Matthieu Jonckheere, Vincent Lefieux, Dominique Picard, Ezequiel Smucler, "Uncovering differential equations from data with hidden variables", arxiv:2002.02250
- Fangzheng Sun, Yang Liu, Hao Sun, "Physics-informed Spline Learning for Nonlinear Dynamics Discovery", arxiv:2105.02368
- Ye Yuan, Junlin Li, Liang Li, Frank Jiang, Xiuchuan Tang, Fumin Zhang, Sheng Liu, Jorge Goncalves, Henning U. Voss, Xiuting Li, Jürgen Kurths, Han Ding, "Machine Discovery of Partial Differential Equations from Spatiotemporal Data", arxiv:1909.06730