The Bactra Review: Occasional and eclectic book reviews by Cosma Shalizi   162

The Nature of Computation

by Cristopher Moore and Stephan Mertens

Oxford University Press, 2011

Intellects Vast and Warm and Sympathetic

The best way to understand what this book is about is to begin with its first examples. I hand you a network or graph, and ask whether there is a path through the network that crosses each edge exactly once, returning to its starting point. (That is, I ask whether there is a "Eulerian" cycle.) Then I hand you another network, and ask whether there is a path which visits each node exactly once. (That is, I ask whether there is a "Hamiltonian" cycle.) How hard is it to answer me?

On the surface, these seem like extremely similar problems. Both can be solved by brute force, by enumerating all possible paths up to some length (the number of edges in the first case, the number of nodes in the second), and seeing whether any of them meet the constraints. Brute force here is truly brutal: the number of paths to check grows exponentially (with the number of edges or of nodes, respectively). The first problem, however, has a short-cut, which was found by Euler: there is such a cycle if and only if the number of edges meeting at every node is even — if they all have even "degree". (This is because we must leave every node on a different edge than we reached it.) If we compare the time needed to check whether the nodes have odd or even degree to the time needed to enumerate paths and check them, we find that the former grows merely linearly. So the Eulerian-cycle problem turns out to be comparatively simple to solve even for very large networks. There is some structure to the problem which mean we don't have to search the whole space of possible paths.

On the other hand, the problem of visiting each node just once, the Hamiltonian cycle problem, has, as far as anyone knows, no such short-cut. One can certainly do things to avoid having to examine every possible path (for instance, ruling out paths early, if they get to some point where they couldn't possibly complete a cycle without re-crossing a node). But even the cleverest approaches anyone has found, even after using every known trick to cut down the search space, still amount to examining an exponentially large number of possibilities one by one.

Suppose, after letting you struggle with the Hamiltonian cycle problem for a while, I decide to put you out of your misery by telling you that there is a cycle. You are, naturally, suspicious and resentful by this point, and demand that I prove it by showing you the cycle. If I tell you a sequence of moves which I claim is a Hamiltonian cycle, you can check that it is quickly and easily, by tracing the path through the graph and seeing whether it follows the rules. So there is a huge difference, for the Hamiltonian cycle problem, between how hard it is to find a solution, and how hard it is to check a solution1.

The contrast between the Eulerian-cycle problem and the Hamiltonian-cycle one turns out to be fundamental. On one side of the divide, with Eulerian cycles, are all the problems where the time needed to work out the solution from scratch is grows only polynomially with the size or complexity of the problem; this class of problems is called P. On the other side, with Hamiltonian cycles, are problems where positive solutions can be verified "in polynomial time". This class is called NP, for "nondeterministic polynomial", the idea being that one could solve the problems in polynomial time if one just had a magical ("nondeterministic") Oracle which provided answers, and one double-checked them out of caution. These are not exhaustive; beyond NP would lie problems where, say, even verifying a solution from the Oracle is itself an NP problem, and so on.

We can prove that a problem is in P by actually finding a method for solving it, and showing that it takes only a polynomial number of steps; detecting Eulerian cycles is clearly in P. And it's (naturally!) easy to verify2 that any problem in P is also in NP. But this leaves a troubling gap: how, if at all, do we know that detecting Hamiltonian cycles is in NP but not in P? How do we know that any problem is in NP but not in P?

The fact that smart people have been looking for fast ways to solve the Hamiltonian cycle problem since the mid-1800s and not got very far is not much of an argument. (What, after all, is a mere 150 years in the history of mathematics?) It could be that there is some brilliant trick which will simplify the Hamiltonian-cycle problem, the same way that checking whether all nodes have even degree simplifies the Eulerian-cycle problem. If there is such a trick, then the Hamiltonian cycle problem really belongs in P, and we are just too immature to see it.

But detecting Hamiltonian cycles is not just any old exponentially-hard problem. It turns out that it is possible to turn any NP problem into an instance of detecting Hamiltonian cycles. (We can build a special graph which encodes the original problem, and where there is a Hamiltonian cycle if and only if that problem has a positive answer. This can be done rapidly, i.e., in polynomial time, and without knowing the answer to the original problem.) Problems like detecting Hamiltonian cycles, whose solution would let us solve all problems in NP, are NP-hard. If, like detecting Hamiltonian cycles, an NP-hard problem is itself in NP (as opposed to being in some yet higher and harder class), it is NP-complete.

Some NP-complete problems are mathematical curiosities like detecting Hamiltonian cycles. Others, however, are problems with real-world applications, especially optimization problems, such as the famous traveling salesman problem. While linear programming with continuous-valued solutions is in P, it can become NP-complete if solutions are forced to be integers. One far-ranging NP-complete problem is what's called "satisfiability", determining whether some set of logical constraints on a collection of variables can all be satisfied at once. The ability to solve NP-complete problems in polynomial time would be tantamount to automating mathematical discovery itself (as Moore and Mertens discuss).

By this point, we have seen all of the characteristic features of computational complexity theory. The basic objects are problems and methods for solving them. Problems are classified according to the resources needed to implement methods of solution — or, more precisely, by how the required resources scale with some measure of the size of the problem. Complexity classes are mathematical but qualitative distinctions (polynomial, or faster than polynomial?), which are preserved under translations or transformations of the problems. Checking a candidate solution is generally much easier than devising them from scratch; the latter requires some sort of actual insight into the problem. Most complexity classes have complete sub-classes, which can be efficiently translated into all other problems in the class, so that a method to solve any of the complete problems cracks open the whole class. Perhaps most fundamentally, none of this has anything to do with computer hardware. Moore and Mertens repeat the joke that theoretical computer science is no more about computers than astronomy is about telescopes, but the analogy is imperfect. To really do astronomy, after all, we do need instruments like telescopes, and an understanding of optics. It would be fairer to say that computational complexity is no more about computers than Euclidean geometry is about rulers and compasses.

What then is computational complexity of theory about? It is a study of explicit methods and procedures; of problem-solving; of the limits of (one kind of) intellect. Nothing examined as an object of the theory, as a "problem", would, in 1900 or even 1940, have been considered as anything less than an exercise of intelligence. (Consider how many of them come from games.) The perspective taken is to ask for genuinely methodical ways of solving the problems, to ask for algorithms, and to care not so much about the actual method as about the fact that there is some method, and about the resources which would be required by any algorithm3 which can do the job. That there is anything non-trivial to say about this at all is surprising and wonderful, and I think computational complexity should be seen as one of the great triumphs of twentieth century science.

The Nature of Computation began life as an exposition of the links between computational complexity theory and physics, which is not as strange as it might seem. I have been talking about the worst-case complexity of problems — the mythology is that the Adversary4 crafts instances of the problem to force us to work as much as possible. One can also, more benignly, imagine that problems are drawn at random from some distribution, and ask about the average-case complexity. The average-case and worst-case complexities for many NP-complete problems have a curious relationship, most easily illustrated for satisfiability problems. What makes an instance of satisfiability hard to solve is that each variable is subject to multiple constraints involving other variables, so that trying to meet one constraint nearly-inevitably messes up the others. Say that each constraint involves just three variables. When the number of constraints per variable is small, say on average two constraints per variable, then most instances can be satisfied, and finding a solution is easy, the average-case complexity is only polynomial. If the number of constraints per variable averages, say, six, then in most instances all of the constraints cannot be satisfied, the typical problem is intractable, and the average-case complexity is (apparently) exponential. Even more strangely, the switch-over is abrupt, with a small increase in the density of constraints leading to a drastic increase in computational complexity; the transition becomes sharper and sharper as the number of variables grows. (The critical density of constraints per variable appears to be 4.267, but is not yet exactly known.) This is strongly analogous to phase transitions such as freezing in physics, and tools from statistical mechanics explain a lot about this cross-over.

Other connections between computational complexity and theoretical physics need another few words of explanation. So far I have been talking about the time required by algorithms, measured in the number of basic steps or operations they need to perform. The bulk of computational complexity theory is indeed about time complexity, which is only appropriate for mortal creatures like us, but time is not the only resource an algorithm needs. "Space", or really memory for inputs, intermediate results, and outputs, is another crucial one, though usually less pressing. For example, to find Hamiltonian cycles, we don't need to keep all possible paths in memory at once; we just need to enumerate them, in some order, and remember a limited amount of intermediate work about the current path. (Since finding Hamiltonian cycles is NP-complete, all NP problems can thus be solved using only polynomial amounts of memory; they are in PSPACE.) Some methods might call upon multiple processors, i.e., multiple centers of calculation; then interaction between the processors becomes a resource. We might be content with not getting quite the right answer — this is especially the case with optimization problems — and then the degree of approximation, of acceptable slop, is a resource.

Time, memory, interaction, and approximation are not the only resources, and the connections to physics really come from others. Well-defined algorithms can also make use of some external source of randomness — "At this step, flip a coin, and if it is heads then...". The point of doing this, usually, is to get a high-quality approximation to the right answer with high probability, much more quickly than an certain and exact answer. (Sometimes it is to evade traps which the Adversary might set for a merely deterministic method.) Calls to a source of random numbers are then another resource, and there are deep questions about how many are needed, and when they can be replaced by invocations of some deterministic source of pseudo-random numbers. At the frontier, we go beyond classical randomness altogether, and start using quantum mechanics itself as a resource. Despite the impression given by popular accounts and promoters, quantum computers cannot in general solve NP-complete problems in polynomial time (unless P=NP, of course), but they can solve some very particular problems faster than any (known) classical computer, and they are fascinating objects in their own right.

While the first impulse for this book was to expound topics like phase transitions in NP-complete problems, quantum computing, and randomness as a resource, where there are connections to theoretical physics, "the tale grew in the telling". No real knowledge of physics is required even to follow those chapters; rather, they are very skillful introductions to the necessary elements of statistical and quantum mechanics.

This is, simply put, the best-written book on the theory of computation I have ever read; one of the best-written mathematical books I have ever read, period. I am horribly biased in its favor, of course — Cris is a mentor and collaborator, and, even more, an old friend — but from beginning to end, and all the 900+ pages in between, this was lucid, insightful, just rigorous enough, alive to how technical problems relate to larger issues, and above all, passionate and human. (There were many pages where I could hear Cris, full of enthusiasm for the latest puzzle to catch his attention and wanting to share.) I recommend this most strongly to anyone who remembers a little calculus and how vectors add, whether you think you care about computational complexity or not.

1: It may have occurred to you that I have only talked about how easy it is to verify a positive answer to the Hamiltonian cycle problem, by checking a "witness" to the existence of such cycles. I leave it as an exercise to the reader to decide whether there are witnesses to the non-existence of Hamiltonian cycles, and, if so, how easy they might be to check. ^

2: The Oracle's solution can be verified in polynomial time by solving the problem yourself, and seeing if the Oracle agrees with you. ^

3: If one gets serious about this, one is forced to be clear about what counts as "an algorithm", and this leads into models of computation, especially to universal models of computation — ways of formalizing algorithms which, provably, can emulate all of the others. That such universal models exist is not at all obvious, but is rather one of the great discoveries of computer science. The oldest and most basic of these are the Turing machines, but the wonderful thing is that it doesn't really matter which universal model of computation one uses. ^

4: I have no idea why theoretical computer science insists on using a vocabulary which makes it sound like a bad fantasy novel (i.e., the kind I read). Perhaps it's some lingering influence of Norbert Wiener? ^

Hardback, ISBN 978-0-19-923321-2, official website

Computers and Computing; Mathematics; Physics

Finished 2 September 2012