Equations of Motion from a Time Series

Last update: 07 Jul 2025 13:45
First version: 28 June 2021

This is, of course, a phrase I've borrowed from my thesis adviser (who was in turn referencing what's probably still his best-known and most influential paper). Let me try to explain.

Suppose I have a physical system whose state is represented by some (generally multi-dimensional) variable \( S \). The state changes over time; without loss of generality, at the initial time \( t=0 \) we have the state \( S(0) \). By "equations of motion" physicists generally mean equations which describe how the state changes over time, and (this is important) how it _would_ change if the initial condition were different. That is, it's not enough to describe one curve or trajectory; the equations of motion have to give all possible trajectories, at least within some non-trivial domain of application.

Assume for simplicity that the system is time-homogeneous. (Time-inhomogeneous stuff is basically similar but needs more notation.) Then the general way we'd write the equations of motion is to introduce a family of operators \( \Phi_t \) where \[ S(t) = \Phi_t S(0) \] and where, for self-consistency, the operators form a semi-group, \[ S(t+h) = \Phi_h \circ \Phi_t S(0) = \Phi_h S(t) = \Phi_t S(h) \]

Even more often, the equations of motion are differential equations, \[ \frac{dS}{dt} = L(S(t)) \] for some differential operator \( L \). I realize I'm being ridiculously abstract, so let's boil this down to something nice like the simple harmonic oscillator. Here \( S \) is a two-dimensional vector, with the two dimensions being position and momentum, say \( S_1 \) and \( S_2 \) respectively, and the equations of motion are \[ \left(\begin{array}{c} \frac{dS_1}{dt} \\ \frac{dS_2}{dt} \end{array}\right) = \left( \begin{array}{c} \frac{S_2}{m} \\ -k S_1 \end{array}\right) \] (In this case the differential operator I called \( L \) is linear, but in general that need not be true.) You can convert this to the evolution-operator form by solving the differential equations.
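To make the conversion to evolution-operator form concrete, here is a minimal sketch for the harmonic oscillator. It assumes the standard closed-form solution with angular frequency \( \omega = \sqrt{k/m} \); the function name `flow_matrix`, the default parameter values, and the initial condition are illustrative choices, not anything fixed by the text above.

```python
import numpy as np

def flow_matrix(t, m=1.0, k=1.0):
    """Matrix representing Phi_t for the harmonic oscillator.

    The state is S = (position, momentum); m and k are the mass and
    spring constant appearing in the equations of motion.
    """
    omega = np.sqrt(k / m)  # angular frequency
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.array([[c, s / (m * omega)],
                     [-m * omega * s, c]])

S0 = np.array([1.0, 0.0])   # hypothetical initial condition
t, h = 0.3, 0.5             # arbitrary illustrative times
S_t = flow_matrix(t) @ S0

# Semi-group property: evolving by t and then by h is evolving by t + h.
assert np.allclose(flow_matrix(h) @ S_t, flow_matrix(t + h) @ S0)
```

Because this particular \( L \) is linear, each \( \Phi_t \) is just a matrix; for a nonlinear system one would generally have to integrate the differential equation numerically instead.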

So far so good on "equations of motion". (That's a lie, I seriously doubt anyone who didn't know what "equations of motion" meant followed any of that, but it at least fixed notation, and I might perhaps replace it with something genuinely explanatory later.) Now about the "from a time series" part. Suppose we have observations of one state trajectory at some finite set of times, say \( s(t_1), s(t_2), \ldots, s(t_n) \). Let's say we know, or are willing to believe, that the state evolves according to some differential equation, described by the differential operator \( L \). The goal then is to learn, induce or infer the differential operator \( L \), and thus the equations of motion. Or, if we can't do that, maybe we can learn the semi-group of \( \Phi_t \) for selected values of \( t \), perhaps just the powers of \( \Phi_h \) for some lag \( h \).

Obviously this problem is not solvable in general. There are multiple reasons for this. Here's a basic one: solutions of differential equations are (generally) continuous functions of time, but we've observed the trajectory at a finite number of time points, so we don't even know what the solution was. Plus, of course, the same function of time can be the solution to multiple differential equations! (This is where the bit about wanting to know what would happen if the initial conditions had been different becomes important.) So the whole problem is ill-posed and under-determined in many, many ways. One can nonetheless ask about a procedure which will be "consistent" in the (confusing) sense in which statisticians use the word, i.e., which will deliver a sequence of approximate answers which will get closer and closer to the truth as we get more and more data. "More and more data" here might mean: more and more trajectories at the same times \( t_1, \ldots t_n \) but starting from different initial conditions; longer and longer trajectories starting from the same initial condition; or "in-fill asymptotics", where we keep the initial and final times fixed but increase \( n \), so the time elapsed between state measurements shrinks.
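The point that one trajectory can solve several differential equations can be illustrated with a toy scalar system. In this sketch the two vector fields `field_a` and `field_b`, the initial conditions, and the fixed-step Runge-Kutta integrator are all made up for illustration. The fields agree wherever \( s \leq 1 \), so a trajectory started at \( s(0) = 1 \) cannot tell them apart, while one started at \( s(0) = 3 \) can.

```python
def field_a(s):
    # ds/dt = -s everywhere
    return -s

def field_b(s):
    # Same as field_a for s <= 1; equals -1 for s > 1
    return -s + max(s - 1.0, 0.0)

def integrate(field, s0, t_end, n_steps=1000):
    """Integrate ds/dt = field(s) from 0 to t_end with fixed-step RK4."""
    h = t_end / n_steps
    s = float(s0)
    for _ in range(n_steps):
        k1 = field(s)
        k2 = field(s + 0.5 * h * k1)
        k3 = field(s + 0.5 * h * k2)
        k4 = field(s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return s

# Started at s(0) = 1, both fields produce the same trajectory.
same_a, same_b = integrate(field_a, 1.0, 1.0), integrate(field_b, 1.0, 1.0)

# Started at s(0) = 3, the trajectories separate.
diff_a, diff_b = integrate(field_a, 3.0, 1.0), integrate(field_b, 3.0, 1.0)
```

Data from the first initial condition alone, however finely sampled, could never distinguish the two equations; only trajectories that visit \( s > 1 \) can.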

Here is a very crude approach. Suppose that the intervals \( t_i - t_{i-1} \) are all equal to \( h \), and that \( h \) is small, so we can say \[ \frac{dS}{dt} \approx \frac{S(t+h) - S(t)}{h} \] Now say \( Y_{i} = \frac{S(t_{i+1}) - S(t_{i})}{h} \) and \( X_i = S(t_i) \) for \( i \in 1:(n-1) \), and run your favorite nonparametric regression of \( Y \) on \( X \). Under some very broad conditions (starting with the process being ergodic and mixing), this will actually give a consistent estimate of the map \( x \mapsto (\Phi_h x - x)/h \), and hence of \( \Phi_h \). Getting a consistent estimate of \( L \) is a little trickier but also certainly possible in an in-fill regime.
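A minimal sketch of the crude approach, under several assumptions of my own: the data come from a simulated unit-mass, unit-spring-constant harmonic oscillator, the "favorite nonparametric regression" is a Nadaraya-Watson smoother with a Gaussian kernel, and the bandwidth is picked by hand rather than by cross-validation. The function names are illustrative.

```python
import numpy as np

def simulate_oscillator(n, h, q0=1.0, p0=0.0):
    """Exact harmonic-oscillator states (m = k = 1) at times 0, h, ..., (n-1)h."""
    t = h * np.arange(n)
    q = q0 * np.cos(t) + p0 * np.sin(t)
    p = -q0 * np.sin(t) + p0 * np.cos(t)
    return np.column_stack([q, p])

def finite_difference_pairs(S, h):
    """X_i = S(t_i) and Y_i = (S(t_{i+1}) - S(t_i)) / h."""
    return S[:-1], (S[1:] - S[:-1]) / h

def kernel_regression(X, Y, x_new, bandwidth):
    """Nadaraya-Watson estimate of E[Y | X = x_new] with a Gaussian kernel.

    Only meaningful near the observed states; far from the data the
    weights underflow to zero and the ratio is undefined.
    """
    x_new = np.atleast_2d(x_new)
    sq_dist = ((x_new[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    return (weights @ Y) / weights.sum(axis=1, keepdims=True)

h = 0.01
S = simulate_oscillator(n=2000, h=h)
X, Y = finite_difference_pairs(S, h)

# Query a state on the observed trajectory (the state at time t = 1).
x_query = np.array([np.cos(1.0), -np.sin(1.0)])
estimate = kernel_regression(X, Y, x_query, bandwidth=0.05)[0]
truth = np.array([x_query[1], -x_query[0]])  # (S_2 / m, -k S_1) with m = k = 1
```

Note that a single oscillator trajectory lies on one ellipse in the state space, so the fit says nothing about states off that ellipse, which is the identifiability issue again.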

Now, nonparametric regression methods can be notoriously hard to interpret (cough neural networks cough), whereas we might want clean equations of motion. One would hope that this could be achieved through regularization and penalties, but that's a question for investigation.
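As one guess at what "regularization and penalties" could look like in practice, the sketch below regresses the finite-difference slopes on a small library of polynomial terms and repeatedly zeroes out small coefficients. This is sequentially thresholded least squares, a stand-in technique chosen here for illustration and not something the text above commits to. The library, the threshold, the three initial amplitudes, and all function names are assumptions.

```python
import numpy as np

def simulate_oscillator(n, h, q0):
    """Exact harmonic-oscillator states (m = k = 1), started at (q0, 0)."""
    t = h * np.arange(n)
    return np.column_stack([q0 * np.cos(t), -q0 * np.sin(t)])

def polynomial_library(X):
    """Candidate terms: 1, S_1, S_2, S_1^2, S_1 S_2, S_2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

def thresholded_least_squares(Theta, Y, threshold=0.1, n_iter=10):
    """Least squares, then repeatedly drop small coefficients and refit."""
    coef = np.linalg.lstsq(Theta, Y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        for j in range(Y.shape[1]):        # refit each equation separately
            keep = ~small[:, j]
            if keep.any():
                coef[keep, j] = np.linalg.lstsq(
                    Theta[:, keep], Y[:, j], rcond=None)[0]
    return coef

h = 0.01
X_parts, Y_parts = [], []
for amplitude in (0.5, 1.0, 2.0):          # several initial conditions
    S = simulate_oscillator(n=700, h=h, q0=amplitude)
    X_parts.append(S[:-1])
    Y_parts.append((S[1:] - S[:-1]) / h)   # differences within one trajectory
X, Y = np.vstack(X_parts), np.vstack(Y_parts)

# Column j of coef holds the fitted equation for dS_{j+1}/dt.
coef = thresholded_least_squares(polynomial_library(X), Y)
```

On this noise-free toy data the surviving terms should be \( S_2 \) in the first equation and \( S_1 \) in the second; whether anything like this stays consistent with noise, hidden variables, or a misspecified library is exactly the open question.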

Obvious complications are obvious: what if we don't observe \( S(t) \) but \( X(t) = f(S(t)) \)? What if we observe through noise? What if there is a stochastic component to the dynamics?

James P. Crutchfield and Bruce S. McNamara, "Equations of Motion from a Data Series", Complex Systems 1 (1987): 417--452 [PDF reprint via Prof. Crutchfield]

Xiaowu Dai, Lexin Li, "Kernel Ordinary Differential Equations", arxiv:2008.02915

Kathleen Champion, Bethany Lusch, J. Nathan Kutz, Steven L. Brunton, "Data-driven discovery of coordinates and governing equations", arxiv:1904.02107 [Pretty sure that the "first method of its kind" stuff runs smack into Crutchfield & McNamara...]
Toby S. Cubitt, Jens Eisert, Michael M. Wolf, "Extracting dynamical equations from experimental data is NP-hard", Physical Review Letters 108 (2012): 120503, arxiv:1005.0005
Jinchao Feng, Yunxiang Ren, Sui Tang, "Data-driven discovery of interacting particle systems using Gaussian processes", arxiv:2106.02735
Gevik Grigorian, Sandip V. George, Simon Arridge, "Learning Governing Equations of Unobserved States in Dynamical Systems", arxiv:2404.18572
Ziming Liu and Max Tegmark, "Machine Learning Conservation Laws from Trajectories", Physical Review Letters 126 (2021): 180604
Ryan Lopez, Paul J. Atzberger, "GD-VAEs: Geometric Dynamic Variational Autoencoders for Learning Nonlinear Dynamics and Dimension Reductions", arxiv:2206.05183
Peter Y. Lu, Samuel Kim, Marin Soljacic, "Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning", Physical Review X 10 (2020): 031056, arxiv:1907.06011
Suryanarayana Maddu, Bevan L. Cheeseman, Christian L. Müller, and Ivo F. Sbalzarini, "Learning physically consistent differential equation models from data using group sparsity", Physical Review E 104 (2021): 042310
Yahya Sattar, Samet Oymak, "Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems", Journal of Machine Learning Research 23:140 (2022): 1--49
Agustín Somacal, Yamila Barrera, Leonardo Boechi, Matthieu Jonckheere, Vincent Lefieux, Dominique Picard, Ezequiel Smucler, "Uncovering differential equations from data with hidden variables", arxiv:2002.02250
Fangzheng Sun, Yang Liu, Hao Sun, "Physics-informed Spline Learning for Nonlinear Dynamics Discovery", arxiv:2105.02368
Ye Yuan, Junlin Li, Liang Li, Frank Jiang, Xiuchuan Tang, Fumin Zhang, Sheng Liu, Jorge Goncalves, Henning U. Voss, Xiuting Li, Jürgen Kurths, Han Ding, "Machine Discovery of Partial Differential Equations from Spatiotemporal Data", arxiv:1909.06730