January 24, 2006

Graphs, Trees, Materialism, Fishing

Attention conservation notice: Over 6000 words of dubious value. Written as a contribution to the Valve event on Franco Moretti's Graphs, Maps, Trees. Contains advice to literary scholars from someone utterly unqualified to give it, hypothesis testing, rastergrams, many long quotations, and ruminations on materialism and rational history. If this is the sort of thing you're interested in, you'd really be better off reading Moretti yourself.
Cross-posted to the Valve, where there is a comments section. I can't promise to reply...

A few years ago, I wrote a review of Moretti's Atlas of the European Novel, in which I presumed to tell him how to go about his business. When he ran across it, his reaction was not (as mine would've been, had our situation been reversed) to tell me where to get off, but to invite me to a workshop he was organizing at Stanford on new interdisciplinary work on the novel — its motto, the quotation from Brecht about "questions that appear to us completely unsolved", is recycled for this book — where I had a great time. Reading these essays as they came out in New Left Review, I enjoyed them greatly, and recall thinking that Moretti could hardly have done a better job of appealing to my prejudices if he'd tried. (Said prejudices are those of someone almost equally fond of The Extended Phenotype and Main Currents of Marxism.)

With this kind of background, it comes as no surprise, I trust, that I really like this book, and find objecting to what he actually proposes here highly wrong-headed. In what follows, I want to say a bit about "graphs" and a bit about "trees", and explain why this sounds so promising to me. I am not going to say anything about "maps", because I don't think I have anything to add to that discussion, but I will, for the sake of getting an M in there, end with some remarks on "materialism". At no point can I pretend to be competent to evaluate the originality of Moretti's work within literary scholarship, to say how much of a departure, say, the trees really are. In a feeble attempt to pretend that my price is higher than a weekend in California and a review copy, I will make some criticisms, most about tedious extra stuff I wish Moretti had also done. I'd like to think that what I say will also have some value for those who don't share my rather haphazard intellectual trajectory, but my experience with trying to communicate across disciplines means I'll get a warm glow if I'm even comprehensible, never mind persuasive. I am accordingly very grateful to the Valve, and especially to Jonathan Goodwin, for letting someone with my credentials (viz., none) participate in this event.

Graphs

Do Genres Come in Bunches?

Moretti makes a very striking claim in his first chapter: that genres of novels appear together, in clusters, separated by about 25 years, and disappear together too. Looking at his graph, my eye agrees, but my eye also tells me that there are faces in clouds (the East African Plains Ape is an incorrigible pattern-finder), and probability theory tells me that purely random processes can produce a lot of apparent clustering and regularity. What reason is there to think that what looks like genres coming in clusters isn't just coincidence?

Let's be a little more precise about what we'd mean by "chance" and "coincidence" here. One natural possibility is that new genres appear at a constant rate over time, utterly independently of one another. Every year, then, there would be a constant probability of a new genre forming, but whether it did or not would have no bearing on whether the next year saw a new genre. This is our null model: the one which says what things should look like if we're just fooling ourselves, and there are no clusters. To get slightly technical, the intervals between genre-arrivals should follow what's called a geometric distribution. Assuming, for the sake of argument, that that's true, we can use the average time between genre-appearances (3.44 years) to estimate the most likely value for the probability of a new genre appearing in any given year (about 29%).
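For anyone who wants the step from "average gap" to "yearly probability" spelled out, here is a short sketch. It assumes the gaps are counted in whole years starting from 1; the symbols $t_i$ (the observed gaps), $n$ (how many there are) and $p$ (the yearly probability) are notation introduced here for illustration.

```latex
\begin{align*}
\Pr(T = t) &= (1-p)^{t-1}\, p, \qquad t = 1, 2, 3, \ldots \\
\log L(p) &= n \log p + \Bigl(\sum_{i=1}^{n} t_i - n\Bigr) \log(1-p) \\
\frac{d \log L}{dp} &= \frac{n}{p} - \frac{\sum_{i} t_i - n}{1-p} = 0
  \quad\Longrightarrow\quad
  \hat{p} = \frac{n}{\sum_{i} t_i} = \frac{1}{\bar{t}} \approx \frac{1}{3.44} \approx 0.29
\end{align*}
```

In words: under this model the most likely yearly probability is just one over the mean gap.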

Once we assume that the inter-arrival distribution is geometric and find the parameter, we can simulate from it, and get examples of what Moretti's graph would look like, if only chance were at play.

The top line shows the appearance dates of Moretti's 44 genres; the next two lines give the results of simulating from a model of uniform random appearance, with the same mean time between genres as the actual history.
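To make the simulation step concrete, here is a minimal sketch of drawing one history from the null model. The 44 genres and the yearly probability of 0.29 come from the text; the function name, the seed and the 1740 starting year are placeholders chosen for illustration, not a reconstruction of the code behind the figure.

```python
import random


def simulate_history(n_genres, p, start_year, rng):
    """Return appearance years for n_genres under the geometric null model.

    Each year a new genre appears with probability p, independently of
    everything that came before.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    years = [start_year]
    while len(years) < n_genres:
        gap = 1
        # Keep waiting, one year at a time, until a genre appears.
        while rng.random() >= p:
            gap += 1
        years.append(years[-1] + gap)
    return years


if __name__ == "__main__":
    rng = random.Random(2006)  # arbitrary seed so that reruns match
    # Two simulated histories, in the spirit of the lower two lines of the plot.
    for _ in range(2):
        print(simulate_history(44, 0.29, 1740, rng))
```

Printing a few such histories side by side is a quick way to see how much apparent bunching pure chance produces.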

Is there more clustering in reality than in the results of the null model? I couldn't say, by eye, but I don't have to. I can calculate the probability of generating Moretti's history from the null model: it's somewhat less than 1 in 10^45. This in itself isn't decisive, since any particular history becomes less and less probable as one considers longer and longer intervals of time (cue Stoppard), so we need to know what fraction of all histories of that length are at least that unlikely. I could work this out exactly, if I were willing to do some actual math, but I'm lazy, so I just had the computer simulate a million histories and evaluated all of their likelihoods. If the null model were actually true, we'd see histories like Moretti's only about 0.4 percent of the time. [1] So this is actually pretty good evidence that the null model is not true, and Moretti's history does show the kind of clustering he thinks it does.

Of course, this only underlines the question of why Moretti's data is clustered. I can think of a couple of deflating explanations (maybe the clusters match the periods more intensely scrutinized by historians; maybe they tend to adjust when they report genres appearing towards certain focal dates). Or it could be due to some sort of exogenous influence, from war, politics, economic shifts, etc. (I did not try removing the obviously-topical genres, like Chartist novels, and repeating the analysis.) Or it could be due to some sort of endogenous mechanism within the system of literary production and consumption: generational turn-over of authors, of readers, of editors and publishers (suggested by my friend Bill Tozier). Or: maybe there's some space of things-people-like-in-novels, which the popular genres at any one time partition up in various ways; if one genre dies out or another appears, this might destabilize all the others as well. I don't think Moretti's time series, by itself, is enough to begin to let us decide among these mechanisms (some of which are compatible), but I do think it lets us see that some mechanism is called for.

Here is my first reproach: Moretti should have been the one to do this analysis, not me. If testing hypotheses is too banausic and mechanical for the pages of New Left Review, then it should either be in another article, or in the book. Moretti is a shrewd man, and in this case his intuitive analysis of the data was right, but there is no reason to rely on intuition alone for something like this. And, if one is going to go to the trouble to collect quantitative data, one ought to use it quantitatively. Mathematical abstraction (quantitative or otherwise) is not valuable for its own sake, but for the inferences it lets us make, when the proper tools are applied. In this case, those tools are pretty easy to bring to bear. They should be.

Dissolving Genre History

Here is Moretti at the end of "Graphs":

For most literary historians ... there is a categorical difference between 'the novel' and the various 'novelistic (sub)genres': the novel is, so to speak, the substance of the form, and deserves a full general theory; subgenres are more like accidents, and their study, however interesting, remains local in character, without real theoretical consequences. The forty-four genres of figure 9, however, suggest a different historical picture, where the novel does not develop as a single entity—where is 'the' novel, there?—but by periodically generating a whole set of genres, and then another, and another... Both synchronically and diachronically, in other words, the novel is the system of its genres: the whole diagram, not one privileged part of it. Some genres are morphologically more significant, of course, or more popular, or both—and we must account for this: but not by pretending that they are the only ones that exist. And instead, all great theories of the novel have precisely reduced the novel to one basic form only (realism, the dialogic, romance, meta-novels...); and if the reduction has given them their elegance and power, it has also erased nine tenths of literary history. Too much.

On the one hand, this seems to me to be obviously correct. On the other hand, I wonder very much why Moretti stops here. If we look within any one of those forty-four genres, I think we have every reason to suppose that we'd find it composed, in its turn, of sub-genres, and so on, and ultimately of a shifting succession of individual texts. "The" Bildungsroman (to pick one of the forty-four, not entirely at random) is a short-hand way of referring to the most common and enduring features of a historically-changing and always-various population of books, just as "the" bottle-nosed dolphin is an abbreviation for the leading tendencies of a certain population of organisms. What Moretti hints at, in the paragraph I quoted, is that "the" novel is itself a population, either of genres, or of texts structured into genres. But he doesn't say outright what seems very plain to me, and so I'd like to know why, and specifically whether he thinks it's actually wrong, or unhelpful.

The assumptions of population thinking are diametrically opposed to those of the typologist. The populationist stresses the uniqueness of everything in the organic world. What is true for the human species—that no two individuals are alike—is equally true for all other species of animals and plants. Indeed, even the same individual changes continuously throughout its lifetime and when placed into different environments. All organisms and organic phenomena are composed of unique features and can be described collectively only in statistical terms. Individuals, or any kind of organic entities, form populations of which we can determine the arithmetic mean and the statistics of variation. Averages are merely statistical abstractions, only the individuals of which the populations are composed have reality. The ultimate conclusions of the population thinker and of the typologist are precisely the opposite. For the typologist, the type (eidos) is real and the variation an illusion, while for the populationist the type (average) is an abstraction and only the variation is real. [2]

This makes salient the question of how we mark off different populations as distinct. The usual biological criterion is through common descent, and the possibility of inter-breeding: Mayr's "biological species concept". (There is a vast controversial literature on the details.) Ruth Garrett Millikan has a closely related notion of "reproductively-established families", which doesn't lean so heavily on the details of biology, and which would seem to fit the case of genres of novels. One could also define classes of texts purely morphologically, which might include many unrelated lineages (just as one might consider all streamlined marine predators which live in the water all the time, a class including dolphins, killer whales, sharks, tuna, ichthyosaurs, etc.). Just as such organic forms have appeared in several lineages, morphologically-defined categories could appear in multiple places and periods, the way novels arose, apparently quite independently, in both the Hellenistic world and in China (and elsewhere, for all I know). Historical populations, however, are unique.

Trees

One could ... take evolutionary bibliography as the prototypical evolutionary science and think of biology in terms of bibliographic analogies... [3]

The Cabinet of Horrors

When trying to explain cultural change and cultural variation, people have generally sought to do so by supposing culture is causally driven by something else (the climate, the social structure), or, even more strongly, that it is adapted to something else, or, more strongly yet, that it functions adaptively for the benefit of something else (here social structure, or ruling classes, are favored as suspects over the climate). This has led to an awful lot of (if I may use the phrase) adaptationist just-so stories, and uncritical analogy-mongering on a level with the sort of thinking which leads rhinoceros horn to be prescribed for impotence. Jon Elster is worth quoting at some length:
In his comments on the links among capitalism, Protestantism, and Catholicism Marx set a disastrous precedent for many later writers who have attempted to find "structural homologies" or "isomorphisms" (two fancy terms for "similarities") between economic structures and mental products. Because virtually any two entities can be said to resemble each other in some respect, this practice has no constraints other than the inventiveness and ingenuity of the writer: There are no reality constraints and no reality control.
Marx suggests two inconsistent lines of argument. One is that there is a strong connection between mercantilism and Protestantism, the other that there is an elective affinity between mercantilism and Catholicism. He was confused, apparently, by the fact that money has two distinct features that point to different religious modes. On the one hand, money (gold and silver), unlike credit, can be hoarded. Hoarding easily turns into an obsession, which is related to the fanatical self-denying practices of extreme Protestantism. On the other hand, money can be seen as the "incarnation" or "transubstantiation" of real wealth. In that sense the money fetishism associated with mercantilism is related to the specifically Catholic practice of investing relics and the like with supernatural significance. Both arguments are asserted several times by Marx, each serving to show up the essential arbitrariness of the other. Later attempts to explain the theology of Port Royal, the philosophy of Descartes, or the physics of Newton in terms of similarities with the underlying economic structure are equally arbitrary. Like the analogies between societies and organisms that flourished around the turn of the century, they belong to the cabinet of horrors of scientific thought. Their common ancestor is the theory of "signs" that flourished in the century prior to the scientific revolution inaugurated by Galileo — the idea that there are natural, noncausal correspondences between different parts of the universe. What Keith Thomas refers to as the "short-lived union of science and magic" maintained a subterranean existence of which the doctrine of ideology, in one of its versions, has been one manifestation. [4]

Even if we shutter and lock the Cabinet of Horrors, and go to look for explanations of trends in such cultural products as novels (which is, after all, what Moretti wants), I'm afraid we will find most of them in the capacious Closet of Mildly Appalling Objects. There is no shortage of attempts to give such changes meaning as signs of something else, some aspect of the social or economic structure, of the way we live now (or the way they lived then), but very, very few of them are convincing. In his great book on changing fashions, A Matter of Taste, the sociologist Stanley Lieberson looks at some of the reasons why these attempts at ad hoc explanation are so often bad. (He puts things more politely; I paraphrase.) First, the facts are often just screwy, about both the developments to be explained and their supposed causes: non-existent trends, non-existent causes, weirdly mis-characterized trends, trends being explained by events which happened long after the former began, etc. (In fairness, such "scholarly misconstruction of reality" is a lot more common than we academics like to think.) Second, the mechanism connecting the explanantia to the explananda is left totally obscure. Third, no attempt is made to test the explanation, by checking that it can account for the magnitude of the observed change, by ruling out alternative explanations, or by much of anything else. The result is a steady stream of claims about how culture works which are advanced with what is, under the circumstances, an astonishing degree of assurance. Lieberson's book provides many fine examples of such cavalier just-so story-telling for names, the decline of hats, etc. [5]

Checking hypotheses about causation, and still more about adaptation, is really hard with just one case, arguably hopeless. What you need is the ability to reliably detect departures from the hypothesis, if they are actually present — "power", in the statisticians' jargon. It is hard to get much power when n=1. If you want to claim that certain aspects of 19th century British novels were the way they were because those features fitted with ideologies of British imperialism — a fairly strong hypothesis about adaptation — I don't see how you can do it just by interpreting Mansfield Park, no matter how subtle and sophisticated your reading. On the other hand, if you look at lots of contemporary novels, and the ones which (say) depict Great Britain's relations with its colonies in the same way as Mansfield Park does are systematically more successful, on average, than those which depict it differently, well then I don't see how that couldn't be good news for your idea, though even that would really only be the beginning of backing it up.

Biologists have given a lot of thought to checking hypotheses about adaptation, and developed many means of doing so. Mutatis mutandis, many of these means could also be applied to literature, or other aspects of culture. Eric Rabkin, Carl Simon and their collaborators have started doing just this with their Genre Evolution Project, looking at short stories from 20th century American science fiction, and no doubt there are others doing this kind of thing too.

One way of checking adaptive hypotheses, especially relevant here, is the "comparative method", or rather methods, which work much, much better when combined with good phylogenies. I think a literary historian who wants to study the evolution of genres and devices would be very well advised to look at the comparative methods biologists employ to study the evolution of qualitative characteristics of organisms. (The major issue would be that literary phylogenies will not be trees but more complicated lattices. But this is analogous to the effects of lateral gene transfer, common among bacteria, and so I'd suspect not only solvable but solved, someplace in the literature. Whether inheritance is by means of discrete-valued, particulate factors, i.e., genes, is not a crucial issue for such methods.) What I really want to see from Moretti (or someone) is a study along these lines of clues in the detective story; I'd be even more interested in one of free indirect discourse.

A crucial aspect of testing hypotheses about adaptation is a contrast with the outcome of a well-crafted neutral model, a way of saying what to expect if no adaptation were present, or not that adaptation anyway. These often have surprising consequences; for instance, neutral genetic drift will tend to fix some version of a gene in a given population, even if it confers no fitness advantage. (This is described in any book on population genetics.) So I wonder about things like whether we should expect, under a reasonable neutral model, that some formal device should become universal within a genre? If so, did clues take over detective stories any faster than neutrality would predict? (It's hard to imagine a successful genre where every story relies on confessions found by accident, but whether that's intrinsically weirder than actually existing detective stories, I can't say.)
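As a toy illustration of how drift alone can make one device universal, here is a minimal sketch using a Wright-Fisher-style resampling scheme borrowed from population genetics. Everything specific in it is a made-up assumption: a genre of 50 stories, two devices labelled "clues" and "confession", and each new story copying the device of a randomly chosen story from the previous round, with no preference either way.

```python
import random


def generations_to_fixation(population, rng, max_generations=100_000):
    """Resample with replacement each generation until one variant remains.

    Returns (winning_variant, number_of_generations). Raises RuntimeError
    if the safety cap max_generations is reached first.
    """
    current = list(population)
    for generation in range(max_generations):
        if len(set(current)) == 1:
            return current[0], generation
        # Neutral copying: every existing story is equally likely to be imitated.
        current = [rng.choice(current) for _ in current]
    raise RuntimeError("no fixation within max_generations")


if __name__ == "__main__":
    rng = random.Random(7)  # arbitrary seed
    # Hypothetical starting point: the two devices are equally common.
    start = ["clues"] * 25 + ["confession"] * 25
    wins = {"clues": 0, "confession": 0}
    for _ in range(200):
        winner, _generations = generations_to_fixation(start, rng)
        wins[winner] += 1
    print(wins)
```

Every run ends with one device universal, even though neither has any advantage; which one wins varies from run to run. Comparing the observed speed of a real takeover against the fixation times from a model like this is one way the "faster than neutrality would predict" question could be framed.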

The foregoing shouldn't be taken to mean that comparative literature should slavishly imitate comparative biology. There are people who have thought about the application of evolutionary ideas to social and cultural change in ways which are much more sophisticated about psychology, social organization and human interaction than (most) advocates of memetics; I am thinking particularly of David Hull, W. G. Runciman, Dan Sperber, Stephen Toulmin's great The Collective Use and Evolution of Concepts, and even the fragmentary MS. of Adam Westoby. As the economist Richard Nelson writes, we should expect our ideas of general evolution to change as we learn more about cultural evolution. We should also expect to have to develop different methods of data analysis. But, as always, we start with what we already know how to do.

Materialism

I share Moretti's hope for a "materialist sociology of literary form"; Hell, I'd like a materialist sociology of culture generally. But I suspect it won't be able to do everything he wants it to.

When Moretti quotes D'Arcy Thompson on how the form of an object is a diagram of the forces which produced it, I'm happy to go along, and even happy to agree that this gives us some ability to work backwards, from form to force. But this sort of inverse problem generally doesn't have a unique solution, especially if some of the forces were transient and highly contingent... Less metaphorically, something Lieberson argues very convincingly is that we often have to distinguish between the social forces causing there to be a change in some taste, and those which shape the content of the new taste. Often the latter mechanisms are more or less internal to the bit of culture in question, like ratcheting. Or: culture doesn't have to express or reflect the social order. I suspect Moretti would be disappointed if this were the case for, say, genres of novels. Well, so would I. But this needs to be checked. One way would be to try to develop good neutral models, and see whether, and where, they break down.

Dan Sperber has a great essay, in his Explaining Culture, on "how to be a genuine materialist in anthropology", where he complains about treating Capital, the World-System, cultural symbol-systems, mentalities, etc. as reified causal forces, if not self-interested foresightful agents, forgetting that human history, society and culture are actually "real individuals, their activity and the conditions under which they live" (to appropriate a once-famous line). It seems, at least to this interested outsider, that the study of literature in society suffers from this, too. And I think what Sperber advocates there should go here, too: give actual causal accounts of how macroscopic patterns emerge from the interaction of many material bodies (notably, people and books), of the sort we know to exist, endowed with the kinds of abilities we know them to have.

This commitment may sound harmless, because contentless, but it does actually have implications. It means that you have to do a lot of work to justify functionalist explanations (though it's not impossible). It should make you very dubious about ideal types. It should make you more interested in exploring variation, and not dismissing it. It should make you very dubious about "practices" and other shared mental objects, at least as ordinarily conceived. And it suggests a lot of productive directions, investigating communication, cognition, and the collective patterns they produce.

In Graphs, Maps, Trees, as in his Atlas, Moretti is basically looking at the communication end of things. He doesn't say much about cognition, or individual thought more generally. Elsewhere (see e.g. Signs Taken for Wonders) he has dabbled in psychoanalysis, but I hope that's past. A materialist theory of literary form will ultimately have to concern itself with the organic processes of reading and composition, but the way to do this is through empirical study of readers and writers, not more interpretation of texts, or armchair ruminations (whether those are on the primal scene, the environment of evolutionary adaptation, or conceptual blending). Of course literary scholars have been making stabs in this direction at least since Richards's Practical Criticism, but with the advent of cognitive psychology this can be done in a much more systematic way, combining modeling of cognition with experimental tests of the models. [6] Again, many people (e.g., Jerry Hobbs, Herbert Simon) have been proposing this for some little while, but it's only recently, with works like Bortolussi and Dixon's Psychonarratology, that people have begun to actually do it, taking the predictions of various theories of narrative, which say that changing stories in certain ways should affect readers' responses, and seeing whether that's actually right. This, and not desk-bound speculation about analogies, seems to me the proper way to start on a cognitive psychology of literature. It is obviously complementary to what Moretti wants to do, and (this is the sweet part) the two enquiries can be pursued in parallel; neither has to wait for the other.

One thing Moretti does not do, anywhere, is construct models linking interacting individual behavior to aggregate patterns. Economists and sociologists already make such models, and anthropologists are starting to do so. It may be premature here, but ultimately it will be vital. If different social groups have different beliefs, is that because those beliefs express their relations to the mode of production, or is it because they tend to talk more within the group than across group boundaries? Adaptationist theories of culture tend to go for the first choice, but we don't really know whether the latter could account for the specific patterns of cultural difference and change that we see.

How Not to Learn from the Natural Sciences

What I said above about not mindlessly imitating biology deserves some amplification.

Evolution ought to have a bad name in the study of literary history. Reading Rene Wellek's "The Concept of Evolution in Literary History" (or his article for the Dictionary of the History of Ideas) is actually quite depressing. (It brings to mind Kurt Vonnegut's line "they deserved to fail, because they were all so stupid".) The many post-Darwinian ventures in this direction went, essentially, nowhere, at least as far as understanding literature better goes. It surely didn't help that their understandings of biological evolution were often very bad, generally some kind of Spencerian or even Lamarckian belief in tendencies of progressive development — perhaps inspiring, but hopelessly un-explanatory. (This has vitiated far too much evolutionary theorizing about social processes; cf. Toulmin's chapter 5.) As for the more recent wave, since the 1980s, the people who seem to think that literature exists because humanity craves dramatizations of Daly and Wilson's Sex, Evolution and Behavior drive me up the wall. (Their idea makes no sense even if you are very sympathetic to evolutionary psychology, which I am.)

Which said, this is not at all what Moretti is proposing, and I don't see the harm in trying to make this all fit together as another instance of a general pattern, alongside biological evolution, because they have similar causally-relevant features, and so similar mechanisms are at work. Many people have pointed out, in some detail, that explaining biological processes through the joint action of variation and selective transmission in populations is one instance of a general pattern of historical explanation; Toulmin is particularly clear on this [7]. There is a demography of businesses, of interest groups, even of medieval manuscripts of classical works, and so why not one of literary texts? Inheriting discrete, particulate hereditary factors from a small, fixed number of immediate ancestors is not the sine qua non of this form of historical explanation, though the details of the process of inheritance will very strongly affect the character of the resulting dynamics. It might be that theories of literary change cast in this form are too complicated to be useful, or that we just don't know enough yet to find the useful ways to formulate them. But it wouldn't hurt to seriously try, and we'd learn a lot, no matter the eventual outcome.

Varieties of Rational History

One way to take the bit from Braudel about "a more rational history" that Moretti adopts as a motto is simply to hope that literary history will be a rational enterprise. There are various aspects to this — the accumulation of knowledge, a desire to give explanations, a realization that more than one explanation might be possible and a desire to check which one is right, and so on. To do all this, it's important to develop, use and refine reliable methods of inquiry — ones which are unlikely to lead you into error, and where errors are apt to be self-correcting. You want to be able to persuade others, and you want to know that you're not just persuading yourself. As a statistician, my job is to help with that bit, so it looms large for me. I think this is more or less what Moretti has in mind when he talks (elsewhere) about wanting "falsifiable" literary history — for ideas which have enough content that they can not only be communicated from one person to another (without tripping Liberman's detector), but checked. Which said, I wish that here, as in his Atlas, Moretti had done a more systematic job of checking his conclusions. Would it be unfair to suggest that, while he sees the need for data analysis, it will be left to a successor generation to put it into routine practice?

If you want to say that asking literary history to be communicable, testable and reliable is asking it to be scientific and that's icky, well, it's a free country (at least for now). The more I think about what makes something a science, the less that seems like an important question. But whether something is a rational enterprise of inquiry matters. I'm sure it's possible to object to wanting history to be more rational in this sense, but I find that thought so alien and pointless I won't even try to engage it.

Another take on "rational history" is that the vast mass of details in small-scale history are essentially random, or, more exactly, the connections among them are as convoluted and involved as the details themselves. (This is one way to define randomness, mathematically.) But looking at larger scales, the randomness averages out, leaving regularities which are simpler and more nearly comprehensible by finite minds, and more reliable. As a statistical physicist and a statistician, I am the last to disagree: "In fact, all epistemological value of the theory of probability is based on this: that large-scale random phenomena in their collective action create strict, nonrandom regularity." [8] The small-scale details of literature and of human life have an intrinsic interest and value that is missing from the small-scale detail of molecular chaos, so there is certainly all the room in the world for what Moretti would like to do and close reading, and even essayistic appreciation. (But there is not, I am afraid, room enough in the world for Harold Bloom.) Whether there is room in an academy organized around the production of peer-reviewed research findings for all of them, is fortunately not a question I need to have an opinion on.

Finally, you might be tempted to go from the last sense of "rational" to supposing that large-scale history must be the working-out of some scheme which is "rational" in that it's really deterministic, or even teleological. This would be a mistake. It is not at all hard to give examples of stochastic processes which combine random evolution and feedback, which converge on very nice large-scale regularities, but which regularity they converge on is completely random and indeterminate. [9] Brian Arthur, among others, argues that processes like this are important in the evolution of technology. Is literature like that? I have no idea. But I don't see any reason it can't be, and this needs to be borne in mind.
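One standard textbook example of such a process is the Pólya urn, offered here purely as an illustration (it is an assumption on my part that it is a fair stand-in for the processes meant above). Start with one red and one blue ball; repeatedly draw a ball at random and put it back along with another of the same colour. A minimal sketch, with the function name and seed chosen arbitrarily:

```python
import random


def polya_share(n_draws, rng):
    """Run a two-colour Polya urn and return the final share of red balls."""
    red, blue = 1, 1  # start with one ball of each colour
    for _ in range(n_draws):
        # Draw a ball at random, then return it with one more of the same colour.
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)


if __name__ == "__main__":
    rng = random.Random(42)  # arbitrary seed
    # Each run settles towards a stable share of red, but not the same share.
    for run in range(5):
        print(run, round(polya_share(10_000, rng), 3))
```

Within any one run the share of red stabilizes, which is the "nice large-scale regularity"; across runs the value it stabilizes at is scattered all over the interval from 0 to 1, because early draws get amplified by the feedback.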

Go Fish

Let me close by quoting the same paragraph twice, once from the version in NLR, and then again from the closing pages of the book. In both cases, Moretti is enumerating themes which stretch across his chapters.

First, a total indifference to the philosophizing that goes by the name of 'Theory' in literature departments. It is precisely in the name of theoretical knowledge that 'Theory' should be forgotten, and replaced with the extraordinary array of conceptual constructions, —theories, plural, and with a lower case 't'—developed by the natural and by the social sciences. 'Theories are nets', wrote Novalis, 'and only he who casts will catch'. Theories are nets, and we should learn to evaluate them for the empirical data they allow us to process and understand: for how they concretely change the way we work, rather than as ends in themselves. Theories are nets; and there are so many interesting creatures that await to be caught, if only we try.

First of all, a somewhat pragmatic view of theoretical knowledge. 'Theories are nets', wrote Novalis, 'and only he who casts will catch'. Yes, theories are nets, and we should evaluate them, not as ends in themselves, but for how they concretely change the way we work: for how they allow us to enlarge the literary field, and re-design it in a better way, replacing the old, useless distinctions (high and low; canon and archive; this or that national literature...) with new temporal, spatial and morphological distinctions.

Whether this pragmatic message is what Novalis meant, I have no idea; I only know the line because Popper used it as the epigraph for The Logic of Scientific Discovery. But that's what Popper meant by it, and I think it's right, and I look forward to seeing the coelacanths and tube-worms and giant squid which will be brought up from the deeps in years to come.

[1]: More on testing the null model of genre appearance, for those into that kind of thing: Really, of course, the most suitable null model for random appearance would be a continuous-time Poisson process. Since the data are discretized by years, however, I'm faking it by using a geometric distribution of inter-arrival intervals. (I also tried simulating from a Poisson process and then discretizing the result; the results weren't much different.) The only parameter of such a process is the mean inter-arrival time, or equivalently the "intensity", the probability per year of producing a new genre. Simple maximum likelihood estimation gives this as 0.2905405, which implies a log-likelihood for the original data of -103.9498. To evaluate the significance, I generated 1,000,000 sample paths, of the same length as Moretti's, and then for each one re-estimated the intensity and used that to evaluate the log-likelihood. (This sort of "bootstrapping" should account for the fact that I fit that parameter to the data in the first place. It wouldn't be appropriate if, say, Moretti had advanced the conjecture that the mean inter-arrival time should be 10 years on independent grounds.) Of the 1,000,000 sample paths, only 3,802 had log-likelihoods as small or smaller than the original data. That is to say, if the null model were correct, we'd see results like this only about 0.38 percent of the time. So we can certainly reject the null model at the conventional 5 percent significance level, or even the 1 percent level, and in fact this is a considerably more severe test than that.
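
For readers who want to try this kind of test themselves, here is a minimal Python sketch of the procedure just described. It is not the code behind the numbers above. The gap data are made up for illustration, the function names are hypothetical, and two details are assumptions: gaps are taken to be at least one year (a geometric distribution on 1, 2, 3, ...), and "same length" is read as "same number of inter-arrival intervals".

```python
import math
import random


def fit_intensity(gaps):
    """Maximum-likelihood per-year probability of a new arrival.

    Assumes every gap is a whole number of years >= 1.
    """
    return len(gaps) / sum(gaps)


def log_likelihood(gaps, p):
    """Log-likelihood of the gaps under a geometric distribution on 1, 2, 3, ..."""
    if p >= 1.0:
        # With p = 1 only gaps of exactly one year are possible.
        return 0.0 if all(g == 1 for g in gaps) else -math.inf
    return sum((g - 1) * math.log(1.0 - p) + math.log(p) for g in gaps)


def simulate_gaps(n, p, rng):
    """Draw n geometric gaps (support 1, 2, 3, ...) by inverting the CDF."""
    if p >= 1.0:
        return [1] * n
    gaps = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        gaps.append(1 + int(math.log(u) / math.log(1.0 - p)))
    return gaps


def bootstrap_p_value(gaps, n_sims=10_000, seed=0):
    """Fraction of simulated paths whose re-fitted log-likelihood is
    as small as or smaller than that of the observed gaps."""
    rng = random.Random(seed)
    p_hat = fit_intensity(gaps)
    observed = log_likelihood(gaps, p_hat)
    as_extreme = 0
    for _ in range(n_sims):
        sim = simulate_gaps(len(gaps), p_hat, rng)
        # Re-fit on each simulated path, mirroring the fit made to the real data.
        if log_likelihood(sim, fit_intensity(sim)) <= observed:
            as_extreme += 1
    return as_extreme / n_sims


# Made-up inter-arrival gaps in years, NOT Moretti's data.
example_gaps = [1, 1, 2, 1, 9, 1, 1, 3, 12, 1, 1, 2]
print(bootstrap_p_value(example_gaps, n_sims=2_000))
```

Under these particular assumptions the maximized log-likelihood depends on the gaps only through their total, so how "same length" is read, and how same-year arrivals are handled, matters a good deal for what the test can detect. Treat this as a template rather than a reconstruction.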

[2]: Ernst Mayr, What Evolution Is, p. 84, quoting a 1959 paper of his own.

[3]: This is from Sidney Winter's article on "Natural Selection and Evolution" in the New Palgrave Dictionary of Economics (1987), where he works out the analogy in some detail.

[4]: An Introduction to Karl Marx, pp. 183--184.

[5]: "Adventures of a Man of Science", Elif Batuman's wonderfully-titled review of Graphs, Maps, Trees in n+1 magazine, is a quite nice essay, but it also provides what looks like a typical example of the kind of mere plausibility I have in mind:

Perhaps the Holmes stories are not half-baked versions of the "correct" mystery story, but a different kind of mystery story, wherein the nondecodability of clues is not a bug, but a feature. Conan Doyle was writing during the conquest of England by industry and rationalism; perhaps his readers wanted stories about the kinds of magic that are possible within the constraints of science. Holmes categorically rejects the supernatural, not in order to show that the new, rational rules preclude magic, but in order to show that you can still have magic even if you play by the rules. Decodable clues came a "generation" later, with Agatha Christie and the first World War, and became more rigorous after the second—by which time readers wanted to be reminded that the world was still rational. [pp. 146--147]
First of all, it seems bizarre to say that Britain was being conquered by "industry and rationalism" in the 1890s, long after the scientific revolution, the Enlightenment, the Industrial Revolution and all its social consequences, utilitarianism, etc. (Indeed, Mr. Lecky might want to have a few words...) Second, Batuman gives us no reason to think that contemporary readers saw what Holmes did as (pardon the phrase) magic within the bounds of reason alone. Third, even if she were right about the social situation and the cultural product, the hypothesized causal connection is really just another arbitrary analogy, of the sort Elster complained about. Suppose Conan Doyle had been better about using decodable clues than Christie. Would it not then sound just as plausible to say this expresses the triumph of rationalism, followed by a post-war weakening? As it is, Batuman's account seems to appeal, implicitly, to a desire to hang on to older ways of thinking. Either the whole reading public of Britain in the 1890s is being treated, in a grossly anthropomorphic fashion, as a single person, with such a desire, or she is making a quite specific prediction about which readers Conan Doyle appealed to, one which does not seem especially plausible, though it might be tested. (It is utterly unclear whose purposes or needs are invoked by the in-order-to's — Conan Doyle's? his original readers'? society's? — but I fear the worst.) Finally, no attempt is made to check that this is the source of the appeal, nor that the later strict decodability of clues really was caused by the World Wars, for the reasons given. I don't know enough to say that this suggestion is false, or that checking it would be impossible. I don't even want to suggest that a book review in a little magazine would be a good place to do such tests. But it doesn't seem to worry Batuman that there is no support for this idea (yet). — Let me repeat that I like the essay.

[6]: Incidentally, thinking that cognition is computational, and even that its computational architecture is strongly constrained by organically-evolved developmental processes, in no way commits one to denying that thought is also profoundly cultural and historical. Sperber is very good on this, but also see Frawley's Vygotsky and Cognitive Science, or the papers collected in The Elements of Reason.

[7]: Of course it isn't the only pattern of successful historical explanation. Even within the natural sciences, geology and astronomy provide very different ones.

[8]: Gnedenko and Kolmogorov, Limit Distributions for Sums of Independent Random Variables, p. 1.

[9]: More exactly, there are stochastic processes ("urn schemes") where the relative frequencies of different outcomes are guaranteed to converge, with 100% probability, but the ratio to which they converge is itself a random variable, not determined by the initial set-up in any way. The models of lock-in developed by Brian Arthur and his collaborators in the 1980s are urn models, but actually less indeterministic than the classical ones.
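
The classical example is the Pólya urn: start with one ball of each of two colors, repeatedly draw a ball at random, and put it back along with another of the same color. A minimal Python sketch follows; the function names, the starting composition and the run lengths are illustrative choices, and this is the textbook urn, not one of Arthur's lock-in models.

```python
import random


def polya_urn(steps, seed, red=1, blue=1):
    """Run a two-color Polya urn and return the final (red, blue) counts."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Draw a ball at random; return it plus one more of the same color.
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red, blue


def red_fraction(steps, seed):
    """Fraction of red balls after the given number of draws."""
    red, blue = polya_urn(steps, seed)
    return red / (red + blue)


# The same seed replays the same history, so each row shows one urn
# early (1,000 draws) and late (100,000 draws) in its life.
for seed in range(5):
    print(seed, round(red_fraction(1_000, seed), 3),
          round(red_fraction(100_000, seed), 3))
```

Within a run the fraction of red settles down; across runs it settles somewhere different each time. For this starting composition the limiting fraction is uniformly distributed between 0 and 1.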


Manual trackback: Reprieved; Crooked Timber; Pedantry; Three Quarks Daily; Idiocentrism; An Unenviable Situation ("deeply offensive"); Alan Riddell; Digital Humanities 2011; Waggish

Update, 7 February: Seth Edenbaum has more on why he dislikes this post so much — and why he dislikes me (or at least my online persona; I don't believe we've ever met). I think he's wrong, both about this and about me, but it's only right to point to criticisms.

The Commonwealth of Letters; Writing for Antiquity; Biology; Enigmas of Chance

Posted at January 24, 2006 16:20 | permanent link

Three-Toed Sloth