## Computational Mechanics

*12 Jul 1999 19:33*

(This notebook needs re-writing in a big way. Sorry; here's the old version, for what it's worth.)

Here are two big fuzzy problems which come up in lots of areas of science:

- How do we find patterns in our data?
- How do some natural objects compute, or process information?

You go to the lab and you take your favorite set of measurements, until you can't stand it any more, and you want to make something of the data. You might look for structures or patterns in your data: these may even be those emergent properties so fabled in legend and philosophy. To do this, you need a way of characterizing different structures or patterns, and one way of doing this is to write down a procedure which will reproduce the pattern, to a certain level of abstraction and accuracy. Fortunately there's a lot of theory on the subject of "effective procedures" in general: it's an off-shoot of mathematical logic called computation theory, or automata theory, which turns out to be equivalent to the theory of formal languages. The languages or automata form a hierarchy, in which those at the higher levels can do anything those at the lower levels can: Turing machines perch at the very top of the heap. (The most basic form of the hierarchy was discovered by Chomsky back when he worked for the Air Force, and is called the Chomsky hierarchy.) Now, Occam's razor tells us to use the simplest procedure we can find, and there are plenty of good ways of saying how complicated an automaton is: the more interesting ones even say that highly random automata are simple.
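
As a toy illustration (my own sketch, nothing canonical), the lowest rung of the hierarchy, a finite automaton, already counts as a "procedure which will reproduce the pattern" — here, the alternating pattern 0101...:

```python
# Toy illustration: a two-state finite automaton, the lowest rung of the
# Chomsky hierarchy, as a procedure that reproduces a pattern.
def alternating_automaton(n):
    """Emit n symbols from a two-state machine: each state outputs its
    label and hands control to the other state."""
    state, out = 0, []
    for _ in range(n):
        out.append(str(state))
        state = 1 - state  # flip to the other state
    return ''.join(out)

print(alternating_automaton(8))  # -> 01010101
```

A Turing machine could generate this pattern too, of course; Occam's razor says to stop at the two-state machine.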

The computational mechanics procedure, then, is to take your data,
discretize it so you've only got a small "finite alphabet" to deal with, and
then look for "causal states." Two histories, two series of past data, leave
you in the same causal state if they leave you with the same distribution
of *future* data, i.e., if it makes no difference to the future whether
you saw one data-series or the other. This being the case, the difference
between the series is unimportant, and we lump them together. This procedure
identifies causal states, and also identifies the structure of connections or
succession in causal states, and so automatically creates an automaton in
the lowest Chomsky class you can get away with. (If, as you consider longer
and longer stretches of data, you need more and more complicated automata, you
go to the next most powerful class of automata and start over.) These automata
are called "epsilon-machines" and the procedure "epsilon-machine
reconstruction": the names are appalling, but I've not heard better ones.
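
A very rough sketch of the lumping-together step, in toy code (illustrative only — the function name, the fixed history length, and the crude tolerance are all made up; the real reconstruction algorithms are considerably more careful):

```python
# Sketch of the causal-state idea for a binary time series (alphabet {0,1}
# assumed): group histories by their empirical distribution over the *next*
# symbol, since histories with the same predictive distribution belong to
# the same causal state.
from collections import defaultdict

def causal_state_sketch(series, hist_len=2, tol=0.05):
    """Lump length-hist_len histories together when they predict the same
    next-symbol distribution, up to a crude tolerance."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(series) - hist_len):
        past = tuple(series[i:i + hist_len])
        counts[past][series[i + hist_len]] += 1
    # empirical P(next = 1 | past) for each observed history
    preds = {p: c[1] / (c[0] + c[1]) for p, c in counts.items()}
    # greedily merge histories whose predictions agree within tol
    states = []  # list of (representative probability, set of histories)
    for past, prob in sorted(preds.items()):
        for rep, members in states:
            if abs(prob - rep) <= tol:
                members.add(past)
                break
        else:
            states.append((prob, {past}))
    return states

# A period-2 process: the two phases fall into two causal states,
# "just saw 0" and "just saw 1".
print(len(causal_state_sketch([0, 1] * 500)))  # -> 2
```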

The computation part of "computation, dynamics and inference" is pretty
thoroughly in evidence: but the other two? Well, *inference* is easy:
the machine-reconstruction procedure is an extended exercise in statistical inference, or machine
learning, or induction (whichever you prefer). We're trying to fit our
data to models of pre-specified classes, and to find the simplest, most
accurate model we can. (Obviously there are trade-offs between simplicity and
accuracy.) In principle, the whole process could be programmed, and
encapsulated in a single piece of software: a "phenomenological engine" or
"phenomenologimat" (J. Fetter), an automatic finder of empirical
regularities. (Look for it in the next version of emacs.)

*Dynamics* is a bit trickier. Notice that we're just using our data,
not some *a priori* model of the process, to try to say what the
underlying mechanisms are like. But we may not be measuring the causally
important variables at all, and very coarse discretization is (at least so far)
key. The inspiration for the first, for trying to build a model straight from
the data, is attractor-reconstruction in dynamics: given a time-series which is
a reasonable function of the causal variables, one can reconstruct the
attractor of a dynamical system and even find many of its quantitative
properties. (The limits of "reasonable function of the causal variables" are
set by Takens's Embedding
Theorem.) Many people I talk to don't seem to think that the second point,
the coarse discretization, needs motivation at all: usually they come up with
something along the lines of, "well, it's already digitized when it gets on
your computer, right?" This faith in our current technology is touching (and
completely unfounded, as anyone who reads
comp.risks knows); fortunately there're better
motivations. It happens that for some dynamical systems one can prove that
with the *right* coarse discretization you don't lose any information about
the dynamical behavior, but only have to deal with strings of symbols from a
finite alphabet and operations on those strings, instead of continuous numbers
and non-linear mappings of those numbers. (This is called "symbolic
dynamics", and leads into the thermodynamic formalism for chaos.)
Computational mechanics adds on another layer, by trying to characterize the
symbolic dynamics even more compactly, and in information-processing terms.
(That said, one of my projects is to try reformulating computational mechanics
so that it doesn't need any sort of discretization at all.)
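
A minimal sketch of what symbolic dynamics looks like in practice (illustrative code; the binary partition at 1/2 is the standard generating partition for the logistic map at r = 4):

```python
# Symbolic dynamics sketch: iterate the logistic map at r = 4 and
# coarse-grain each point to one bit (x < 1/2 -> '0', else '1'). Because
# that partition is generating for this map, long symbol strings pin down
# the orbit -- the discretization loses no dynamical information.
def logistic(x, r=4.0):
    return r * x * (1.0 - x)

def symbolize(x0, n):
    """Return the n-symbol itinerary of the orbit starting at x0."""
    x, symbols = x0, []
    for _ in range(n):
        symbols.append('0' if x < 0.5 else '1')
        x = logistic(x)
    return ''.join(symbols)

print(symbolize(0.3, 12))  # a 12-bit itinerary of the orbit from 0.3
```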

Well, what good is this? First of all, it gives you a nice, compact description of your system. Second, it constrains any theories you come up with of the causal mechanisms at work, since they've got to be able to match your reconstructed machine, or one or the other is wrong. Third, it lets you characterize the different structures you find in a content-neutral way, and even quantify them: this is what first led me to computational mechanics, since I'm interested in quantifying self-organization, and quantifying the degree of organization or structure would be a step in the right direction. Fourth, there's some hope of being able to run the process backwards: given a kind of computation, what sorts of dynamical systems would implement it? How do we go from local, idiotic rules to desirable global properties?

I promised that this'll have something to say about understanding how
natural objects --- notably, of course, brains --- can compute. At first there
seems to be a big problem: natural objects are continuous dynamical systems
with causally connected variables; computation or information-processing is the
transformation of representations according to rules. This contrast has led to
a lot of wailing and beating of breasts and a "dynamical systems approach to
cognition" movement (rather stronger in philosophy than in-the-lab
cognitive science) which claims that cognition is *not* computation.
Now, at a hand-waving, in-principle level, I think this can be resolved pretty
much in favor of old-school cognition-is-computation: computational mechanics
shows that discrete computation is embedded in continuous dynamical systems.
(Cf. Kitts's papers below.) Computational mechanics, however, offers the
prospect of going beyond this, to actually seeing what kinds of computations
different things are actually doing.

It will not have escaped the observant reader's attention that the first
part of the program, structure-finding, got a lot more space than the second
part, understanding natural computing. This is because the first part
*is* much more developed than the second, which is still almost entirely
programmatic. So far the only things to get detailed treatment from
computational mechanics are formal systems, like the logistic map and cellular automata, though it *has*
been used on a few experimental data sets: stochastic resonance, the "dripping
faucet" experiment (famous in nonlinear dynamics, utterly obscure elsewhere),
and some turbulent geophysical fluid flow data (A. J. Palmer, personal
communication). I'd like to take some well-understood, well-behaved data-set,
like the tides, or the positions of the Galilean moons, and see how well a
phenomenological engine does on it. But I'm not too worried about the
restricted range of applications to date: until we've got the structure-finding
part down cold, there's not much point in trying to tackle something as hairy
as the brain, and there's a *lot* of work to be done just in getting the
techniques to work on spatially-extended systems.

- Recommended:
- When I gave a pair of lectures on computational mechanics to the Santa Fe summer school, the students made me draw up a Reading List, which see. (This includes some papers of mine; I never claimed modesty among my virtues.)
- Naturally, I'm partial to my own notes for those lectures, but they're technical, fairly mathematically rigorous, and in PostScript only (the various tex to html conversion programs dislike it extremely...). My talk on "Phenomenological Engines", which I gave to the Madison Chaos and Complex Systems Seminar (16 Sept. 1997), can be grokked by professors of art history, but I'm not finished with tweaking it yet...
- Almost everything ever written on the subject is in the Computational
Mechanics Archive.
- Some neat stuff not on the reading list:
- Richard W. Clarke, Mervyn P. Freeman, Nicholas W. Watkins, "The Application of Computational Mechanics to the Analysis of Geomagnetic Data," cond-mat/0110228
- Jordi Delgado and Ricard V. Solé, "Characterizing Turbulence in Globally Coupled Maps with Stochastic Finite Automata," Physics Letters A **314** (2000): 314--319
- Brendan Kitts, Philosophy and Cognitive Science (Review) and "Much Ado about Dynamics"
- Mitchell, Crutchfield, and Hraber, "Dynamics, Computation, and the `Edge of Chaos': A Re-Examination", [Disposing of the "edge of chaos" hypothesis]
- Wesley C. Salmon, Scientific Explanation and the Causal Structure of the World [Trembles on the brink of defining causal states, but comes at this from a very different perspective, namely rather old-fashioned philosophy of science]

- To read
- Gustavo Deco and Bernd Schurmann, Information Dynamics: Foundations and Applications
- Ryan G. James, John R. Mahoney, Christopher J. Ellison, James P. Crutchfield, "Many Roads to Synchrony: Natural Time Scales and Their Algorithms", arxiv:1010.5545
- B. D. Johnson, J. P. Crutchfield, C. J. Ellison, C. S. McTague, "Enumerating Finitary Processes", arxiv:1011.0036
- M. Raijmakers, Epigenesis in Neural Network Models of Cognitive Development: Bifurcations, More Powerful Structures, and Cognitive Concepts [Ph.D. thesis, Amsterdam]