Computational Mechanics

12 Jul 1999 19:33

(This notebook needs re-writing in a big way. Sorry; here's the old version, for what it's worth..)

Here are two big fuzzy problems which come up in lots of areas of science:

Computational mechanics is a research program, developed by Jim Crutchfield and henchmen, which aims to address both of these problems with the same set of tricks. Since I think this is an under-appreciated line of research, and because I became one of the aforesaid henchmen in January 1998, I'll explain this at some length.

You go to the lab and you take your favorite set of measurements, until you can't stand it any more, and you want to make something of the data. You might look for structures or patterns in your data: these may even be those emergent properties so fabled in legend and philosophy. To do this, you need a way of characterizing different structures or patterns, and one of doing this is to write down a procedure which will reproduce the pattern, to a certain level of abstraction and accuracy. Fortunately there's a lot of theory on the subject of "effective procedures" in general: it's an off-shoot of mathematical logic called computation theory, or automata theory, which turns out to be equivalent to the theory of formal languages. The languages or automata form a hierarchy, in which those at the higher levels can do anything those at the lower levels can: Turing machines perch at the very top of the heap. (The most basic form of the hierarchy was discovered by Chomsky back when he worked for the Air Force, and is called the Chomsky hierarchy.) Now, Occam's razor tells us to use the simplest procedure we can find, and there are plenty of good ways of saying how complicated an automaton is: the more interesting ones even say that highly random automata are simple.

The computational mechanics procedure, then, is to take your data, discretize it so you've only got a small "finite alphabet" to deal with, and then look for "causal states." Two histories, two series of past data, leave you in the same causal state if they leave you with the same distribution of future data, i.e., if it makes no difference to the future whether you saw one data-series or the other. This being the case, the difference between the series is unimportant, and we lump them together. This procedure identifies causal states, and also identifies the structure of connections or succession in causal states, and so automatically creates an automaton in the lowest Chomsky class you can get away with. (If, as you consider longer and longer stretches of data, you need more and more complicated automata, you go to the next most powerful class of automata and start over.) These automata are called "epsilon-machines" and the procedure "epsilon-machine reconstruction": the names are appalling, but I've not heard better ones.

The computation part of "computation, dynamics and inference" is pretty thoroughly in evidence: but the other two? Well, inference is easy: the machine-reconstruction procedure is an extended exercise in statistical inference, or machine learning, or induction (whichever you prefer). We're trying to fit our data to models of pre-specified classes , and to find the simplest, most accurate model we can. (Obviously there are trade-offs between simplicity and accuracy.) In principle, the whole process could be programmed, and encapsulated in a single piece of software: a "phenomenological engine" or "phenomenologimat" (J. Fetter), an automatic finder of empirical regularities. (Look for it in the next version of emacs.)

Dynamics is a bit trickier. Notice that we're just using our data, not some a priori model of the process, to try to say what the underlying mechanisms are like. But we may not be measuring the causally important variables at all, and very coarse discretization is (at least so far) key. The inspiration for the first, for trying to build a model straight from the data, is attractor-reconstruction in dynamics: given a time-series which is a reasonable function of the causal variables, one can reconstruct the attractor of a dynamical system and even find many of its quantitative properties. (The limits of "reasonable function of the causal variables" are set by Takens's Embedding Theorem.) Many people I talk to don't seem to think that the second point, the coarse discretization, needs motivation at all: usually they come up with something along the lines of, "well, it's already digitized when it gets on your computer, right?" This faith in our current technology is touching (and completely unfounded, as anyone who reads comp.risks knows); fortunately there're better motivations. It happens that for some dynamical systems one can prove that by the right coarse discretization you doesn't lose any information about the dynamical behavior, but only have to deal with strings of symbols from a finite alphabet and operations on those strings, instead of continuous numbers and non-linear mappings of those numbers. (This is called "symbolic dynamics", and leads into the thermodynamic formalism for chaos.) Computational mechanics adds on another layer, by trying to characterize the symbolic dynamics even more compactly, and in information-processing terms. (That said, one of my projects is to try reformulating computational mechanics so that it doesn't need any sort of discretization at all.)

Well, what good is this? First of all, it gives you a nice, compact description of your system. Second, it constrains any theories you come up with of the causal mechanisms at work, since they've got to be able to match your reconstructed machine, or one or the other is wrong. Third, it lets you characterize the different structures you find in a content-neutral way, and even quantify them: this is what first led me to computational mechanics, since I'm interested in quantifying self-organization, and quantifying the degree of organization or structure would be a step in the right direction. Fourth, there's some hope of being able to run the process backwards: given a kind of computation, what sorts of dynamical systems would implement it? How do we go from local, idiotic rules to desirable global properties?

I promised that this'll have something to say about understanding how natural objects --- notably, of course, brains --- can compute. At first there seems to be a big problem: natural objects are continuous dynamical systems with causally connected variables; computation or information-processing is the transformation of representations according to rules. This contrast has led to a lot of wailing and beating of breasts and a "dynamical systems approach to cognition" movement (rather more strong in philosophy than in-the-lab cognitive science) which claims that cognition is not computation. Now, at a hand-waving, in-principle level, I think this can be resolved pretty much in favor of old-school cognition-is-computation: computational mechanics shows that discrete computation is embedded in continuous dynamical systems. (Cf. Kitts's papers below.) Computational mechanics, however, offers the prospect of going beyond this, to actually seeing what kinds of computations different things are actually doing.

It will not have escaped the observant reader's attention that the first part of the program, structure-finding, got a lot more space than the second part, understanding natural computing. This is because the first part is much more developed than the second, which is still almost entirely programmatic. So far the only things to get detailed treatment from computational mechanics are formal systems, like the logistic map and cellular automata, though it has been used on a few experimental data sets: stochastic resonance, the "dripping faucet" experiment (famous in nonlinear dynamics, utterly obscure elsewhere), and some turbulent geophysical fluid flow data (A. J. Palmer, personal communication). I'd like to take some well-understood, well-behaved data-set, like the tides, or the positions of the Galilean moons, and see how well a phenomenological engine does on it. But I'm not too worried about the restricted range of applications to date: until we've got the structure-finding part down cold, there's not much point in trying to tackle something as hairy as the brain, and there's a lot of work to be done just in getting the techniques to work on spatially-extended systems.