The Bactra Review: Occasional and eclectic book reviews by Cosma Shalizi   41


Hierarchical Structures and Scaling in Physics

by Remo Badii and Antonio Politi

Cambridge Nonlinear Science Series, vol. 6. Cambridge University Press, 1997
"A scientist," our authors begin by saying, "when confronted with a complex problem, feels a sensation of distress that is often not attributable to a definite cause: it is commonly associated with the inability to discriminate the fundamental constituents of the system or to describe their interrelations in a concise way." The colleagues of a scientist who claims to work on complexity suffer another form of distress, with a much simpler etiology: they are afraid that their friend has become a philosopher, or at the very least gone insane. The aim of this book is to reduce both kinds of complexity-related misery to ordinary scientific unhappiness. Colleagues are easily cured: making complexity look like a branch of statistical physics almost always works. It's harder to bring relief to those actually working on complexity, and the strategy for doing so turns on the issue of "concise description" ; there's a lot of ground to be covered before we get there.

After some philosophical-methodological hand-waving, which really only acquires content from the mathematical specifics in the rest of the book, we begin with a bestiary of some systems which are held to be complex within the meaning of the act, and reflect our authors' backgrounds in theoretical physics: fluid and optical instabilities, turbulence, chemical oscillators, DNA. This same background leads them into some unfortunate mistakes about the state of biological theory (indeed, in the preface, they date "the construction of the first mathematical models in evolutionary biology and neuroscience" to the last few years; I hope the news is broken gently to Lotka, Fisher, Haldane, Wright, McCulloch, Pitts and Hebb). This is of little consequence, however, since real-world data almost disappears after this point, though DNA sequences pop up from time to time in examples. Instead we plunge into a sea of abstract mathematical objects, and the next chapter rounds up the usual suspects among them --- nonlinear flows and maps, cellular automata (huzzah!), spin glasses, neural networks.

Some of these (CA, spin glasses) are already concatenations of discrete symbols, drawn from what is known in the trade as a "finite alphabet." Not just unity of presentation, but the whole approach taken by the book, demand that we find a way of casting all the other systems, which take on continuous values, into this form as well. Fortunately it turns out that there already exists a branch of mathematics --- symbolic dynamics --- for doing just this. One divides the state space of a map (not a restriction: Poincaré sections turn flows into maps) into a finite number of partitions, each with their own label (thus the finite alphabet). Each point gets its own sequence of symbols: first that of the partition it is in, then that of the partition its iterate is in, then that of the partition of its iterate's iterate, and so on ad infinitum. If the original partition was chosen well, then every point in the map's state space has a unique string of symbols, and vice versa, and shifting a string to the left is equivalent to iterating the map. When we can do this, considering shifts of these strings is just as good as worrying about the original dynamics of the map, and much more amenable to the implements in our authors' toolkit. Unfortunately, there are proofs that it is possible, that the state space can be so partitioned, only for a comparatively small class of special cases. (Some people actually have the cojones to argue that this isn't troubling, because digital instruments discretize everything anyway; Badii and Politi have more sense.) Since there don't seem to be any maps where it is known that it is definitely impossible to construct a corresponding symbolic dynamics, however, we optimistically assume that anything we really care about can be represented by symbol sequences, and press on.

In general, not every possible string of symbols can actually be generated by the dynamics of the system we're considering. The set of allowed sequences is called the language; its elements, the allowed words (since they're made up of elements from the alphabet). Alas for consistency of metaphor, the rules which allow one to recognize (or, equivalently, generate) the allowed words of the language are said to be its grammar. To each grammar there corresponds an abstract machine or automaton, which can itself recognize (or generate) the words of the language.

The theories of formal languages and of automata underlie one half of complexity; the other half is supported by concepts from probability and information theory, to which the book now turns. After a brief whiff of probability theory (stochastic processes, correlations, power spectra), presumably just to jog the memory, it sinks its teeth into ergodic behavior, information theory, and fractal dimensions. All three concern the distribution of probability over the state space. The first is about the long-time behavior of the dynamics, and the invariant and attracting distributions; ergodicity itself is the property, important for statistical mechanics, of averages over state space coinciding with averages over time. The information theory considered is fairly basic, little more than the Shannon and Kolmogorov-Sinai entropies, basically quantities which tell us about how much uncertainty the distribution leaves us in about our position in state space, and what the time-evolution does to this uncertainty. The fractal dimensions, of course, tell us about how the distribution is affected by changes in scale, in resolution.

All this naturally feeds into the so-called thermodynamic formalism, a way of making symbol sequences look like spin systems, so we can exapt everything we've learned since Ising about the latter to studying the former, and so ultimately to dynamics. (Pace Prigogine, there is no known or even plausible connection between ordinary thermodynamics and complexity.) We pretend that the energy of a symbol sequence is the negative log of the probability of the sequence; this gives us a Hamiltonian; a parameter q, which tunes how much weight the more probable sequences get, stands in for an inverse temperature. From here it is a fairly short step (the transformation from Helmholtz to Gibbs free energy) to computing the entire spectrum of fractal dimensions. A different, slightly longer step, takes us to phase transitions and getting lots of scaling information from the renormalization group. (Using renormalization directly on the dynamics, à la Feigenbaum, is not touched upon.) All this is good stuff, but probably too much for the amount of space alloted, at least as a first exposure to the thermodynamic formalism; students would probably do better to read Beck and Schlögl's Thermodynamics of Chaotic Systems for this material, and also that in the second half of chapter five.

Chapter seven returns to languages and automata, in particular the Chomsky hierarchy, Uncle Noam's most secure claim on immortality. Each language, remember, has a grammar (a generative grammar, no less!) and a corresponding automaton. (As Chomsky wouldn't say: Eine Sprache, eine Grammatik, ein Automat!) By imposing restrictions on the forms the grammatical rules can take, or, equivalently, on the memory available to the automaton, we can divide all languages into four nested classes. Toiling at the base are the members of the smallest, weakest and most restricted class, the "regular languages" generated by automata within only a fixed, finite memory for past symbols; above them are the "context free" languages, whose grammars do not depend on context (natch!); then the context-sensitive languages; and at the very top, lolling in their ease, the unrestricted languages, generated by Turing machines. (Chomsky's assault on behaviorist linguistics hinged, in part, on conditioning only being able to inculcate the rules for finite-state machines, while human natural languages are at least context-free.) Each stage in the hierarchy can simulate all those beneath it.

After a brief exposition of formal language theory, we turn to the "physical characterization of formal languages" in different classes; but this is really about their dynamical properties, of the sorts discussed in previous chapters --- correlation functions, entropies, fractal dimensions. (It seems a bit unnatural to talk about the dynamical properties of languages, but symbolic dynamics gives us a correspondence between strings and trajectories.) The chapter closes by talking about the languages classes various complex systems, like CA and DNA sequences, fall into, where anything is actually known on the subject.

Now we come to the payoff, to complexity itself. At the very beginning we said that something is complex if it can't be concisely described. This suggests equating complexity with the length of the description, and our experience with languages and their grammars ("infinite use of finite means" and all that) suggests that a generating mechanism can be much more concise than what it generates. Or it suggests these things to you if you're a Russian mathematical polymath of genius named Andrei Kolmogorov, from whom all complexity measures ultimately descend. He defined the complexity of a single symbol sequence as the length of the shortest program for a universal Turing machine which would reproduce it exactly. (Nowadays, this quantity is known as the Kolmogorov complexity, or the algorithmic information.) Unfortunately, this turns out to be a measure of randomness more than complexity --- a program of fixed length can generate arbitrarily many digits of pi or e, whereas the shortest program for a random trillion-digit number x is "PRINT x," or something along those lines, so the Kolmogorov complexity per symbol in the one case goes to zero, and in the other case goes to one. The algorithmic information is also, in general, uncomputable, in a very strong sense ultimately tied to Gödel's theorem. The rest of chapter eight reviews a number of other complexity measures (most notably Bennett's logical depth, the running-time of the Kolmogorov program) which patch some of the bugs with Kolmogorov complexity, or extend the realm of application, but ultimately judges none to be satisfactory. The real purpose of this chapter is to soften us up for the next one.

Ultimately, in chapter nine, we abandon reproducing a single pattern exactly; instead we want to capture the statistical regularities in an ensemble of patterns, and we want to do this by inferring models from samples of the ensemble. (Here the intuition that pure randomness --- say, coin-tossing --- is really simple comes into its own: to model coin-tossing, toss a coin.) Now, this opens out onto a very large field, which the authors rightly show trepidation about entering: it is, depending on one's point of view, the problem of machine learning, or of statistical inference, or even that of hypothesis and induction ("And at my back I always hear/ Hume's winged chariot drawing near"). Quite rightly, they do not claim to be setting forth a general theory of model inference; rather, they present some constructions which will probably work on ensembles with a "hierarchical" nature, where "strings of length n1 » n0 > 1 contain sufficient information about substrings of length upto n0 [to determine their properties], strings of length n2 » n1 about those of maximum length n1, and so on.... The rules ... are learned gradually, from the most elementary to the most involved, and the accuracy of the description increases with the level of the approximation." All the complexity measures in this penultimate chapter essentially say how slowly successive models converge as the order of approximation increases (or, analogously, what happens to the error rate as the level of resolution is turned up) --- the slower the convergence, the higher the complexity. The Santa Fe house style for deducing complexity from inferred models is somewhat different, less topological, and to my eyes more elegant and less ad hoc, but Badii and Politi's work is clearly that of kindred spirits, and I half suspect the two approaches will turn out to be complementary in the end. They close with a few pages about the problem of extending the techniques they have described to spatially-extended and quantum-mechanical systems, and a little more hand-waving, to ward off the evil eye and gods looking for hubris to punish.

Textbooks, if we may believe the historians and sociologists of science (always doubtful), serve a dual purpose: not only do they vex and torment the young, but they help define a field, and fix the body of knowledge, assumptions, hunches, fallacies, tricks and prejudices common to all who cultivate it. Their appearance is a sign of disciplinary maturity: not only are there callow minds to warp, but there is some agreement about the warp to give them.

From this point of view, Badii and Politi's book, though not officially a textbook, is a welcome sign that complexity is growing up, and sometime soon might be left alone in the house for a weekend without fear of returning to something out of nightmares. Like its companion volumes in the Cambridge Nonlinear Science series, it has the look and feel (and alas price) of a modern graduate-level text in physics, including the authors' quite comprehensible but definitely unidiomatic (on rare occasions, e.g. the second sentence on p. 277, ungrammatical) English, but with a much more attractive cover than most. The only real impediment to using it in a one semester course, easily extended to one year by covering various ancillary topics in more detail, e.g. using Beck and Schlögl's book, is the absence of exercises (which also limits its value to self-study), though checking and completing the examples would serve as a partial substitute. I wouldn't want to teach such a course to those who hadn't previously been exposed to nonlinear dynamics, or who were unfamiliar with statistical mechanics at the level of Part I of Landau and Lifshitz, say second year graduate students in physics and applied math; but it is, hands down, the best book currently available to teach such critters about complexity, and even more seasoned, not to say jaded, researchers will find it useful as a reference.

Disclaimer: Dr. Badii was kind enough to look over this review, point out some goofs, and explain his cover to me. I have, however, no stake, financial or otherwise, in the success of Complexity.
xiii + 318 pp., lots of black and white diagrams and graphs, well-organized bibliography, subject index
Cellular Automata / Physics / Self-organization, Complexity, etc.
Currently in print as a hardback, US$74.95, ISBN 0-521-41890-9 [buy from Powell's], and as a paperback, US$40, ISBN 0-521-66385-7 [buy from Powell's]. LoC QC174.85 S34 B33
23 April 1998
Thanks to Dave Feldman and Nigel Snoad