## May 31, 2016

### Books to Read While the Algae Grow in Your Fur, May 2016

Attention conservation notice: I have no taste.

Amitav Ghosh, Sea of Poppies, River of Smoke and Flood of Fire
Collectively, "the Ibis trilogy", three historical novels centered around the First Opium War. They're beautifully written and the viewpoint characters (of which there are many, weaving in and out of the three books) are all very well-drawn. Beyond that, the setting and the protagonists give Ghosh a chance to depict — "comment on" suggests something more heavy-handed — imperialism, cultural diversity and exchange, free trade, multiple identities, enough varieties of love that cannot be acknowledged that I'd have to think to list them all, desires ditto, gardening, memory, the perils of getting what you want, and much, much else. It's really impressive, even if I was not very happy with the ending, and I will be revisiting it at a more leisurely pace.
Elizabeth Bear, Karen Memory
Mind candy. I am normally a big fan of Bear's writing, but I only just got through this one. The central feature of the book is the voice of the first-person narrator, Karen Memery (sic), and while this was clearly a labor of love on the part of the author, my reaction to that voice ranged from indifference to irritation. (The character wasn't irritating, her style was.) As for the steampunk setting --- as my friend Henry Farrell once put it, "the goggles do nothing", i.e., it seemed like it would have been very easy to tell a very similar, and no worse, story without those props. Clearly, though, lots of people like it very much, so I will just look forward to Bear's future books.
Elizabeth Hand, Generation Loss
Mind candy: a mystery or literary thriller (or both?). The writing is excellent and the protagonist, a failed New York photographer very much out of her element in Maine, is a very well-realized character (and a complete jerk, with impulses which are much, much worse). There are apparently sequels, which I look forward to tracking down. ROT-13'd, for being both a spoiler and catty: Gubhtu V qb ubcr Pnff trgf orggre nobhg fbyivat zlfgrevrf guna whfg orvat yhpxl jura fur qrpvqrf fbzrbar fbhaqf qhovbhf.
(Picked up on the recommendation of Aunt Agatha's in Ann Arbor.)
Ada Palmer, Too Like the Lightning
This is a deeply impressive effort to take seriously the line that "history is the trade secret of science fiction". That is, Palmer has tried to craft a 25th century which is as strange, as familiar, and as both-at-once-because-that's-not-what-we-meant, as our own time would have seemed to someone from the 17th century. This applies not just to the world-building but also to the story-telling (e.g., the way her narrator is simultaneously speaking to his own future and trying to channel [what he thinks of as] an 18th-century voice). This is, to my mind, exactly the sort of thing good science fiction should do. I hope the example of the effort catches on, though I worry that it will merely be specific inventions which get imitated.
Having enthused about setting and narration, I have to admit to being more ambivalent about the plot. Or maybe plots; there are at least two, one revolving around the high politics of the world, and the other around a young boy who seems to have miraculous powers. Both are hard to summarize, or even describe, and both are left very much unresolved at the end of this book. I find it hard to say whether I like the story, though I was certainly eager enough to keep reading, and am frustrated enough by not knowing what happened next that I pre-ordered the sequel.
ObLinkage: Palmer's round-up of her self-presentations and reviews by others.
Robert Jackson Bennett, American Elsewhere
Mind-candy contemporary fantasy, but of truly exceptional quality. This is in many ways a meditation on Lovecraftian themes, transposed to the Southwest and rationalized with "because of quantum" (*). (Spoiler-proofed discussion below.) But it's not just yet another re-hashing of monsters and tropes from Lovecraft, which would only matter for those who are already fans of that micro-genre. Rather it's a work of genuine artistry and originality, as well as a hell of a lot of fun. The only real point at which I see a failure, or at least a lost opportunity, is that if you are going to tell a story which revolves around physicists in northern New Mexico unleashing something monstrous, you really should engage more with the reality of Los Alamos... But, again, as entertainment this is just remarkably good.
Discussion of the Lovecraftian connections, ROT-13'd for spoilers: Oraargg znxrf ab hfr bs Ybirpensg'f fcrpvsvp zbafgref be cebcf. (Gurer ner n srj zragvbaf bs syhgvat, naq n guebj-njnl nobhg "jura gur fgnef nyvta", juvpu frrz yvxr qryvorengr ersreraprf, ohg nera'g pbafrdhragvny.) Gur gehyl Ybirpensgvna ovgf pbzr va jvgu gur crbcyr va Jvax sebz ryfrjurer, v.r., bgure qvzrafvbaf jvgu qvssrerag culfvpny ynjf. (Gur ivfvbaf jr pngpu bs jung gurve jbeyq vf yvxr ner rrevr.) Gurve gehr nccrnenaprf unir gur hfhny pbzcyrzrag bs gragnpyrf naq gur yvxr, ohg zber vzcbegnagyl gurl ner napvrag, vauhzna orvatf bs vaperqvoyr cbjre, jubfr gehr sbezf naq angherf ner zber guna gur beqvanel uhzna zvaq pna fgnaq gb nffvzvyngr. Gurl unir pbadhrerq znal jbeyqf, naq orra nf tbqf gurer, ohg urer gurl ner zber be yrff uvqqra. Naq lrg gurl ner pbzcyrgryl vagrejbira vagb gur uhzna yvsr bs gur vqlyyvp gbja bs Jvax, juvpu fvzcyl jbhyq abg rkvfg jvgubhg gurz. Crbcyr pbzr gb zber be yrff qvfgheovat evghny neenatrzragf jvgu gurfr cbjref, jvgubhg rirel dhvgr orvat ubarfg jvgu gurzfryirf nobhg jung gurl ner qbvat. Naq gur urebvar svaqf guvf jubyr jbeyq obgu ubeevoyr naq snfpvangvat, naq gura qvfpbiref gung fur vf npghnyyl cneg bs vg, zhpu zber vagvzngryl guna nalbar ryfr va Jvax; gung gur ivrjcbvag punenpgre vf va snpg bar bs gur zbafgref vf n irel Ybirpensgvna gbhpu. Gur ybj-yvsr pevzvanyf ba gur obeqref bs Jvax, gur ebyr bs Zban'f onol, naq gur pyvznpgvp qrfgehpgvba bs gur gbja, ba gur bgure unaq, nyy frrz zber yvxr Fgrcura Xvat.
Query: Do Wink and Night Vale communicate?
*: Whereas in Lovecraft they were rationalized with "because of relativity".
Michael Alan Williams, Rethinking "Gnosticism": An Argument for Dismantling a Dubious Category
I first read this by chance in graduate school, when it seemed to me a really good demonstration of how to do properly critical and skeptical, but not nihilistic, intellectual history. (Among other things, this includes admitting when the social component of such history is largely guesswork.)
On re-reading after fifteen (!) years, I still find the main thesis largely persuasive. To attempt my own summary: the ancient sources which modern scholars label "gnostic" are united neither by clear evidence of a shared tradition or organization, nor even by the reports of the orthodox heresiologists; the supposed "anti-cosmic" attitude, forced alternative of either extreme asceticism or licentiousness, etc., are not supported by the texts (and the latter is a bog-standard accusation of the orthodox against everyone), and seem to be largely modern constructions or interpretations; and that, in short, it would be better to chuck the whole category of "gnosticism" in favor of clearer and more empirical ones, like "biblical demiurgical traditions". (Though Williams doesn't harp on this, one could then investigate, rather than pre-judge, questions like "did all texts with biblical demiurgical myths share a common origin?" and "what range of attitudes towards the human body are shown in such texts?".) I do feel like this would have been even stronger had it included an account of how the modern concept of gnosticism had evolved. I also, of course, feel like I really shouldn't pronounce anything until reading the counter-arguments, but naturally I haven't taken the time to track them down...

Posted at May 31, 2016 23:59 | permanent link

## May 04, 2016

### "Nonparametric Estimation and Comparison for Networks" (Friday at U. Washington)

Attention conservation notice: An academic promoting his own talk. Even if you can get past that, only of interest if you (1) care about statistical methods for comparing network data sets, and (2) will be in Seattle on Friday.

Since the coin came up heads, I ought to mention I'm giving a talk at the end of the week:

"Nonparametric Estimation and Comparison for Networks", UW-Seattle statistics dept. seminar
Abstract: Scientific questions about networks are often comparative: we want to know whether the difference between two networks is just noise, and, if not, how their structures differ. I'll describe a general framework for network comparison, based on testing whether the distance between models estimated from separate networks exceeds what we'd expect based on a pooled estimate. This framework is especially useful with nonparametric network models, such as densities of latent node locations, or continuous generalizations of block models ("graphons"); the estimation methods for those models also let us generate surrogate data, predict links, and summarize structure.
(Joint work with Dena Asta, Chris Genovese, Brian Karrer, Andrew Thomas, and Lawrence Wang.)
Time and place: 3:30--4:30 pm on Friday, 6 May 2016, in SMI 211, UW-Seattle
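The recipe in the abstract (fit a model to each network, pool the fits, and ask whether the observed distance between the separate fits is surprising under surrogate data from the pooled fit) can be illustrated in a toy version. Here a single edge-probability parameter stands in for the nonparametric models of the talk, and all function names are my own invention, not anything from the actual papers:

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_density(adj):
    """Fraction of possible edges present in an undirected network."""
    n = adj.shape[0]
    return adj[np.triu_indices(n, k=1)].mean()

def simulate(n, p):
    """Surrogate data: an Erdos-Renyi graph with edge probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)

def comparison_test(adj1, adj2, n_boot=500):
    """Test H0: both networks were drawn from a common source.

    Fit the (trivial) model to each network, measure the distance
    between the two fits, and compare that distance to its
    distribution under surrogate data from the pooled fit."""
    p1, p2 = edge_density(adj1), edge_density(adj2)
    observed = abs(p1 - p2)
    pooled = (p1 + p2) / 2
    n1, n2 = adj1.shape[0], adj2.shape[0]
    null = [abs(edge_density(simulate(n1, pooled)) -
                edge_density(simulate(n2, pooled)))
            for _ in range(n_boot)]
    return np.mean(np.array(null) >= observed)  # bootstrap p-value

p_value = comparison_test(simulate(200, 0.05), simulate(200, 0.25))
```

With real nonparametric models the "fit" and "distance" steps are of course much more involved, but the logic of the test is the same.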

Posted at May 04, 2016 23:59 | permanent link

### "Partitioning a Large Simulation as It Runs" (Next Week at the Statistics Seminar)

Attention conservation notice: Only of interest if you (1) care about running large simulations which are actually good for something, and (2) will be in Pittsburgh on Tuesday.
Kary Myers, "Partitioning a Large Simulation as It Runs" (Technometrics forthcoming)
Abstract: As computer simulations continue to grow in size and complexity, they present a particularly challenging class of big data problems. Many application areas are moving toward exascale computing systems, systems that perform $10^{18}$ FLOPS (FLoating-point Operations Per Second) --- a billion billion calculations per second. Simulations at this scale can generate output that exceeds both the storage capacity and the bandwidth available for transfer to storage, making post-processing and analysis challenging. One approach is to embed some analyses in the simulation while the simulation is running --- a strategy often called in situ analysis --- to reduce the need for transfer to storage. Another strategy is to save only a reduced set of time steps rather than the full simulation. Typically the selected time steps are evenly spaced, where the spacing can be defined by the budget for storage and transfer. Our work combines both of these ideas to introduce an online in situ method for identifying a reduced set of time steps of the simulation to save. Our approach significantly reduces the data transfer and storage requirements, and it provides improved fidelity to the simulation to facilitate post-processing and reconstruction. We illustrate the method using a computer simulation that supported NASA's 2009 Lunar Crater Observation and Sensing Satellite mission.
Time and place: 4--5 pm on Tuesday, 10 May 2016, in Baker Hall 235B
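The flavor of the idea (though not, I should stress, Myers's actual algorithm) can be conveyed in a few lines: decide online, as each time step arrives, whether it differs enough from the last saved step to be worth keeping, rather than saving every k-th step.

```python
import numpy as np

def select_steps_online(steps, threshold):
    """Keep a time step only if its RMS difference from the last saved
    step exceeds the threshold; always keep the first step."""
    kept = [0]
    last = steps[0]
    for t, state in enumerate(steps[1:], start=1):
        if np.sqrt(np.mean((state - last) ** 2)) > threshold:
            kept.append(t)
            last = state
    return kept

# A fake "simulation": slow diffusive drift, with one sudden event at t = 50.
rng = np.random.default_rng(1)
steps = [np.zeros(100)]
for t in range(1, 100):
    jump = 5.0 if t == 50 else 0.0
    steps.append(steps[-1] + 0.01 * rng.standard_normal(100) + jump)

kept = select_steps_online(steps, threshold=1.0)
print(kept)  # -> [0, 50]: only the first step and the sudden event get saved
```

Evenly-spaced saving would have stored many near-duplicate steps and could easily have missed putting a save right at the event; the point of doing the selection in situ is that the criterion sees the actual dynamics as they happen.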

As always, the talk is free and open to the public.

Posted at May 04, 2016 03:00 | permanent link

## April 30, 2016

### Books to Read While the Algae Grow in Your Fur, April 2016

Attention conservation notice: I have no taste.

Ruth Downie, Medicus; Terra Incognita; Persona Non Grata; Caveat Emptor; Semper Fidelis; Tabula Rasa
Mind candy: historical mysteries set in early 2nd century Roman Britain (and southern Gaul), following the mis-adventures of a Roman legionary doctor and his British wife. (Well, originally Tilla is his slave, but it's complicated.) They are, for me, absolute catnip, and the perfect thing to binge read while in the stage of recovering from food-poisoning where I can read but can't do anything more useful. (I also can't help thinking that they are exactly the sort of thing my grandmother would have loved.)
Kathleen George, A Measure of Blood
Yet another Pittsburgh-centric mystery, taking place largely in the mind of the murderer. Much of the action happens around the University of Pittsburgh, i.e., just down the street.
Mind candy, and not exactly recommended. Brunner was one of the great science fiction writers, the publishers of the ancient paperback edition I have played this up, and there is in fact a very light science-fictional angle to the story. But really it's a mystery novel which is very much a period piece of Swinging London. I enjoyed it, but I also found it funny in ways I doubt Brunner intended. For Brunner completists (in which case, this is, astonishingly, available electronically), or those seeking documents of the milieu.
Scott Hawkins, The Library at Mount Char
Strictly speaking, this is a contemporary fantasy novel set in exurban Virginia, where the main characters are American children who have been selected by a nigh-omniscient teacher to learn the mystic arts at the titular library. What raises it above the level of mind candy is the fact that such a description gives you no idea whatsoever of how strange this story is, either in its content or in its narration. Hawkins is obviously showing off from the very first lines (which hooked me), and makes basically no concessions for weak readers. He also has a pitiless quality towards his characters which I, for one, found very agreeable. The only thing I can begin to compare it to is somebody reading Shadowland, and then saying "That was really good, but Peter Straub's imagination is just too nice and normal". Even that doesn't really convey how impressive a performance this is.
(Picked up on Kameron Hurley's recommendation.)
Jen Williams, The Copper Promise
Mind candy: old-school fantasy, clearly inspired by role-playing games (there are both dungeons, plural, and dragons), but very enjoyably written, delivering the pleasures of light-hearted adventure without being either morally obtuse or wallowing in self-satisfied grimdarkness. It's self-contained, but at least one sequel has come out in the UK already, and both will appear in the US within the year.
I forget where I saw this recommended, but whoever it was, thank you; and additional thanks to a surprisingly-good used English-language bookstore in Amsterdam last summer.
Eric Smith and Harold J. Morowitz, The Origin and Nature of Life on Earth: The Emergence of the Fourth Geosphere
To quote some know-it-all from the dust-jacket, "This is a truly unusual work of scholarship, which offers both novel perspectives on a huge range of disciplines and a model of scientific synthesis. This is a remarkable, and remarkably impressive, book." --- I will try to say more about this book in the coming month.
Disclaimer: Eric is one of the smartest people I've ever met, and, despite that, a friend.
Kelley Armstrong, Forest of Ruin
Mind candy fantasy: a satisfying conclusion to the series, but not quite as satisfying to me as if \SPOILER had not turned out so happily. (On the other hand, I really didn't see that particular twist coming.)
Jack Campbell, The Pirates of Pacta Servanda
Mind candy, continuing the story from previous volumes, and basically incomprehensible without them. In this installment, ~~a group of ideological extremists~~ our heroes ~~establish a safe-haven in a failed state~~ find refuge from ~~the whole of the international community~~ their enemies, ~~running guns to support one warlord over another~~ defending innocent civilians and the last remnants of a traditional monarchy.
Catherine Wilson, Epicureanism at the Origins of Modernity
A gracefully written survey of Epicurean themes in philosophy and science, and to a lesser extent general literary culture, during the 17th century — as in Bacon, Boyle, Hobbes, Locke, Descartes, Spinoza, various erudite libertines, etc. Wilson considers physical, moral and meta-physical ideas, all at a very qualitative level. (E.g., she says relatively little --- though not nothing --- about the increasing role of mathematics in 17th century physical speculations, which from my perspective is one of the biggest differences between ancient atomism and its early-modern descendant.) Very appropriately, she also covers anti-Epicurean reactions, like that of Leibniz, including discussing what they owed to their opponents. The organization is thematic rather than chronological, but the themes are themselves fairly logically arranged. It definitely presumes a broad familiarity with 17th century thought, but not much knowledge of Epicureanism, and it's very skillfully presented. This is the first book of Wilson's I've read, but lots of her stuff looks interesting and I will certainly be tracking down more.
Dream Street: W. Eugene Smith's Pittsburgh Project
Beautiful, beautiful photographs of the city from 1955--1957. (Many but not all of them can be seen online through Magnum.) The composition and selection are both incredible. Smith was evidently a real piece of work, but still the story of a multi-year, career-wrecking obsession with capturing the whole of the life of a city feels, except for the technology, as though it were ripped straight from the Romantic period. (My neighborhood seems to have changed remarkably little in its character over the last sixty years.)
Patrick Manning, Slavery and African Life: Occidental, Oriental, and African Slave Trades
A short but compendious history of the African slave trades --- to the Americas and other European colonies, to north Africa, southwest and south Asia ("oriental"), and within Africa --- their place in world history, their impact on African societies, and their all-too-gradual dissolution. An intriguing feature is the use of a demographic simulation --- what I'd call a "compartmental model" --- to estimate the historical sizes of the populations from which slaves were drawn, and so the impact of the slave trade on population growth and sex ratios within Africa. It would be very interesting to re-do the estimation here. (Thanks to Prof. Manning for lending me a copy of his book.)
Tony Cliff, Delilah Dirk and the King's Shilling
Comic book mind candy, in which Miss Dirk and Mister Selim find themselves compelled to go to England, and mayhem and social sniping ensue. (Previously)
Marie Brennan, In the Labyrinth of Drakes
Mind candy, enjoyable fantasy of 19th century natural history and archaeology division. (Previously.)
N. K. Jemisin, The Fifth Season
Epic fantasy, but I think it rises above the level of mind candy. The approach to story-telling starts out by looking like bog-standard epic fantasy, if well done, but then gets more complicated and interesting (in spoilerish ways). Even better is the world-building: a planet where plate tectonics is so active that the dominant ideology is that the Earth is our father, and he hates us. The "Fifth Season" of the title are the irregular geological disasters which make the only known continent nearly uninhabitable; their depiction is at once chilling and clearly a labor of love. (If it is wrong to be charmed by the range and depths of her catastrophes, then I don't want to be right.) Because this is a fantasy novel, there is also a minority group which has the useful ability of being able to quell these disasters. (Jemisin, characteristically, has thought about the thermodynamics.) They are simultaneously valued for their abilities and despised for their different-ness, with a range of plausible racial stereotypes, more or less internalized by the enslaved members of the group. Because Jemisin is a good novelist, none of this maps exactly on to any real-world minority. The sequel is coming later this year, and can hardly arrive too soon.

Posted at April 30, 2016 23:59 | permanent link

## April 20, 2016

### In memoriam Prita Shireen Kumarappa Shalizi

Posted at April 20, 2016 09:50 | permanent link

## April 19, 2016

### Course Announcements: Statistical Network Models, Fall 2016

Attention conservation notice: Self-promotion, and irrelevant unless you (1) will be a student at Carnegie Mellon in the fall, or (2) have a morbid curiosity about a field in which the realities of social life are first caricatured into an impoverished formalism of dots and lines, devoid even of visual interest and incapable of distinguishing the real process of making movies from a mere sketch of the nervous system of a worm, and then further and further abstracted into more and more recondite stochastic models, all expounded by someone who has never himself taken a class in either social science or any of the relevant mathematics.
Two new half-semester courses for the fall:

36-720, Statistical Network Models
6 units, mini-semester 1; Mondays and Wednesdays 3:00--4:20 pm, Baker Hall 235A
This course is a rapid introduction to the statistical modeling of social, biological and technological networks. Emphasis will be on statistical methodology and subject-matter-agnostic models, rather than on the specifics of different application areas. There are no formal pre-requisites, and no prior experience with networks is expected, but familiarity with statistical modeling is essential. Topics (subject to revision): basic graph theory; data collection and sampling; random graphs; block models and community discovery; latent space models; "small world" and preferential attachment models; exponential-family random graph models; visualization; model validation; dynamic processes on networks.
36-781, Advanced Network Modeling
6 units, mini-semester 2; Tuesdays and Thursdays 1:30--2:50 pm, Wean Hall 5312
Recent work on infinite-dimensional models of networks is based on the related notions of graph limits and of decomposing symmetric network models into mixtures of simpler ones. This course aims to bring students with a working knowledge of network modeling close to the research frontier. Students will be expected to complete projects which could be original research or literature reviews. There are no formal pre-requisites, but the intended audience consists of students who are already familiar with networks, with statistical modeling, and with advanced probability. Others may find it possible to keep up, but you do so at your own risk. Topics (subject to revision): exchangeable networks; the Aldous-Hoover representation theorem for exchangeable network models; limits of dense graph sequences ("graphons"); connection to stochastic block models; non-parametric estimation and comparison; approaches to sparse graphs.
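For readers wondering what a "graphon" is: a symmetric function W(u, v) on the unit square. Each node gets a latent uniform coordinate, and pairs of nodes connect independently with probability W(u_i, u_j); a piecewise-constant W recovers a stochastic block model, which is the "connection to stochastic block models" in the topic list. A minimal sketch (my own toy code, not course material):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_graphon(W, n):
    """Draw an n-node undirected graph: latent uniforms u_i, then
    independent edges with probabilities W(u_i, u_j)."""
    u = rng.random(n)
    probs = W(u[:, None], u[None, :])
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(int), u

def W_block(u, v):
    """A two-block stochastic block model, written as a graphon:
    dense (0.8) within blocks, sparse (0.1) between them."""
    same_block = (u < 0.5) == (v < 0.5)
    return np.where(same_block, 0.8, 0.1)

adj, u = sample_from_graphon(W_block, 200)
```

Any exchangeable-graph distribution arises this way (that is the Aldous-Hoover story, up to mixtures), which is why the graphon makes a natural nonparametric estimand.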
720 is targeted at first-year graduate students in statistics and related fields, but is open to everyone, even well-prepared undergrads. Those more familiar with social networks who want to learn about modeling are also welcome, but should probably check with me first. 781 is deliberately going to demand rather more mathematical maturity. Auditors are welcome in both classes.

Posted at April 19, 2016 16:00 | permanent link

## April 15, 2016

### "Network Comparisons Using Sample Splitting"

My fifth Ph.D. student is defending his thesis towards the end of the month:

Lawrence Wang, Network Comparisons Using Sample Splitting
Abstract: Many scientific questions about networks are actually network comparison problems: Could two networks have reasonably come from a common source? Are there specific differences? We outline a procedure that tests the hypothesis that multiple networks were drawn from the same probabilistic source. In addition, when the networks are indeed different, our procedure may characterize the differences between the sources. We first address the case where the two networks being compared share the same exact nodes. We wish to use common parametric network models and the standard likelihood ratio test (LRT), but the infeasibility of computing the maximum likelihood estimate in our selected families of models complicates matters. However, we take advantage of the fact that the standard likelihood ratio test has a simple asymptotic distribution under a specific restriction of the model family. In addition, we show that a sample splitting approach is applicable: We can use part of the network data to choose an appropriate model space, and use the remaining network data to compute the LRT statistic and appeal to its asymptotic null distribution to obtain an appropriate p-value. Moreover, we show that while a single sample split results in a random p-value, we can choose to do multiple sample splits and aggregate the resulting individual p-values.
Sample splitting is a more general framework --- nothing is particularly special about the specific hypothesis we decide to test. We illustrate a couple of extensions of the framework which also provide different ways to characterize differences in network models. We also address the more general case where the two networks being compared no longer share the same set of nodes. The main difficulty in this case is that there might not be an implicit alignment of the nodes in the two networks. Our procedure relies on the graphon model family which can handle networks of any size, but more importantly can be put in an aligned form which makes it comparable. We show that the framework for alignment can be generalized, which allows this method to handle a larger class of models.
Time and place: 3:30 pm on Monday, 25 April 2016 in Porter Hall A22

Posted at April 15, 2016 12:00 | permanent link

## April 01, 2016

### You Think This Is Bad

Attention conservation notice: Note the date.

Any intelligent and well-intentioned person should have a huge, even over-riding preference for leaving existing social and political institutions and hierarchies alone, just because they are the existing ones. Obviously this can't rest on any presumption that existing institutions are very good, or very wise, or embody any particularly precious values, or are even morally indifferent. They are not. It would also be stupid to appeal to some sub-Darwinian notion that our institutions, just because they have come down to us, and so must have survived an extensive process of selection, are therefore adaptive. At best, that would show the institutions were good at reproducing themselves from generation to generation, not that they had any human or ethical merit. In any case the transmission of any tradition by human beings is inevitably partial and re-interpretive, and so we have no reason to defer to tradition as such.
Stare decisis conservatism rests instead on much less cosy grounds: However awful things are now, they could always be worse, and humanity is both too dumb to avoid making things worse, and too mean to want to avoid making things worse even when it could.

The point about stupidity is elemental. If someone complains that an existing institution is unjust (or unfair, oppressive, etc.), their complaint only has force if a more just alternative is possible. (Otherwise, take it up with the Management.) But it only has political force if that more just alternative is not only possible, but we can figure out what it is. This, we are signally unsuited to do. Social science can tell us many interesting things, but on the most crucial questions of "What will happen if we do this?", we get either dogmatic, experimentally-falsified ideology (economics), or everything-is-obvious-once-you-know-the-answers just-so myths (every other branch of social science). "Try it, and see what happens" is the outer limit of social-scientific wisdom. This is no basis on which to erect a reliable social engineering, or even social handicrafts. When we try to deliberately change our institutions, we are, at best, guided by visions, endemic and epidemic superstitions, evidence-based haruspicy, and the academic version of looking at a list of random words and declaring they all relate to motel service. We have no basis to think that our reforms, if we can even implement them, will rectify the injustice that first aroused our ire, our pity, or our ambition, much less that the attempt won't create even worse problems.

Even getting our pet reform implemented is often going to be hopeless, because so much of our collective knowledge about how to get things done, socially, is tacit. That knowledge is not anything which its holders can put into words, or into a computer, much less into a schedule of prices, but is rather buried in their habits and inarticulate skills.
Often these are the habits and skills of a very small number of crucially-placed people, who are, not so coincidentally, vested in the existing institutions and complicit in the existing injustices. Even more, these are habits and skills which only work in a particular environment, usually a social environment. The same people, asked to make a modified institution work, will be less effective, even hopeless. Throwing the bums out gets rid of the people who knew how to get things done.

Finally, and most crucially, think about what happens when existing institutions and arrangements are disturbed. Social life is always full of a clash of conflicting interests. (One of the few things the economists have right is that inside every positive-sum interaction, there is a negative-sum struggle over who gets the gains from cooperation.) When an institution seems settled, eternal, it fades from view, nobody fights over it. Its harsher lines may be softened by compassion (and condescension) on the side of those it advantages, or local and unofficial accommodations and arrangements, or even just from it being too much trouble to exploit it to the hilt. But question the institution, disturb it, make it obvious that there is something to fight over, and what happens? Those who gain from the injustice won't give it up merely because that would be right. Instead, they will press to keep what they have --- and even to claim more. Since this has become an open conflict of power, what emerges is not going to favor the lowly, poor and the weak. Or if that area of social life should, for a time, descend into chaos, well, the tyranny of structurelessness is real, and those who benefit from it are, again, those who are already advantaged, and willing to exploit those advantages. Things might be very different if people were able to agree on justice, and willing to follow it, but they are not.

To recapitulate: People are foolish, selfish and cruel.
This means that our institutions are always grossly unjust. But it also means that we don't know how to really make things better. It further means that trying to change anything turns it into a battlefield, where nothing good happens to anybody, least of all the weak and oppressed. Since our current institutions are at least survivable (proof: we've survived them), it's better to leave them alone. They'll change anyway, and that will cause enough grief, without deliberately courting more by ignorant meddling.

Of course, people who actually defend inherited institutions and arrangements just because they're inherited — such people can usually be counted on the fingers of one fist. Corey Robin would argue — and he has a case — that the impulse behind most actually-existing conservatism is a positive liking for hierarchy. This was an attempt at trying to construct a case for conservatism which would employ all three of Hirschman's tropes of reactionary rhetoric, but also wouldn't fall apart at the first skeptical prod. (Readers who point me at Hayek will be ignored; readers who point me at "neo-reactionaries" will be mocked.) What I have written is still an assembly of fallacies, half-truths and hyperboles, but I flatter myself it would still stand a little inspection.

Posted at April 01, 2016 00:01 | permanent link

## March 31, 2016

### Books to Read While the Algae Grow in Your Fur, March 2016

Attention conservation notice: I have no taste.

Guido W. Imbens and Donald B. Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction
While I found less to disagree with about the over-all approach than I anticipated, I am genuinely surprised (not "shocked, shocked!" surprised) to find so much sloppiness in the mere data analysis. I can't recommend this book to anyone who isn't already well-trained in applied statistics. To say any more here would preempt my review for JASA, so I'll just link to that when it's out.
— I will however mention one grumble, which didn't fit in the review. From p. 174:

The possible advantage of the frequentist approach [over the Bayesian] is that it avoids the need to specify the prior distribution $p(\theta)$ for the parameters governing the joint distribution of the two potential outcomes. However, this does not come without cost. Nearly always one has to rely on large sample approximations to justify the derived frequentist confidence intervals. But in large samples, by the Bernstein-Von Mises Theorem (e.g., Van Der Vaart, 1998), the practical implications of the choice of prior distribution is limited, and the alleged benefits of the frequentist approach vanish.

I don't see how to unpack everything objectionable in these few sentences without rehearsing the whole of this post, and adding "the bootstrap is a thing, you know".

Tarquin Hall, The Case of the Love Commandos
Mind candy: the latest in the mystery series, though enjoyable independently; this time, we find Vish Puri unwillingly drawn into the nexus of caste and politics in rural Uttar Pradesh.

Jack Campbell, The Dragons of Dorcastle, The Hidden Masters of Marandur, The Assassins of Altis
Mind candy science fantasy. There are some thematic similarities to Rosemary Kirstein's (much superior) Steerswomen books. Those themes are, as it were, here transcribed into the key of Teen's Own Adventures (Campbell gets points for having the Heroic Engineer with a Destiny be a young woman), with less compelling world-building than Kirstein. Still, I zoomed through these and await the sequels. ROT-13'd for spoilers: Bar jnl va juvpu Xvefgrva'f obbxf ner fhcrevbe vf gung ure cebgntbavfgf unir gb npghnyyl svther bhg gur uvqqra gehguf bs gurve jbeyq, jurernf Pnzcoryy gnxrf gur ynml snagnfl-jevgre jnl bhg bs univat gurer or uvqqra fntrf jub pna whfg gryy gur urebrf rirelguvat. Nyfb, V nz abg fher V unir rire frra "orpnhfr bs dhnaghz!" hfrq fb funzryrffyl ol nal jevgre jub jnfa'g n zrqvpny dhnpx.
Paul McAuley, Into Everywhere
Further into the future of his (excellent) Something Coming Through, in which finding that we are only the latest in a galaxy full of the remains of much older, much more powerful, and much weirder alien civilizations is not very good for humanity. For instance, the scientific method seems to atrophy as we move up the time-line, in much the way Chomsky fears will result from cheap computing [*]. There is a reason for this. ROT-13'd for spoilers: Gur eriryngvba ng gur raq, gung gur gehr nvz bs nyy guvf nyvra zrqqyvat vf abg gb qb fbzrguvat gb uhznavgl ohg gb trg hf gb cebqhpr NVf, orpnhfr gur shgher bs nal vagryyvtrag yvarntr vf hygvzngryl znpuvarf, vf bs pbhefr fgenvtug bhg bs Pynexr'f 2001. Guvf yrnqf zr gb jbaqre jurgure gurfr abiry'f nera'g ZpNhyrl va qvnybthr jvgu Pynexr, rfcrpvnyyl jvgu 2001 rg frd. naq gur Guveq Ynj, va zhpu gur jnl gung, fnl, Pbasyhrapr jnf ZpNhyrl va qvnybthr jvgu Jbysr naq gur Obbx bs gur Arj Fha.

*: From Chapter 59, "Synchronicity":

They didn't appear to use any kind of analytical reasoning to confirm their conjectures, employing instead a crude form of experimental Darwinism, seeding a matrix with algorithms modelling variations of their initial assumptions and letting them run to a halting state, selecting those that most resembled the observed conditions, and running and re-running everything over and over again until they had derived an algorithm that reproduced reality to an agreed level of statistical confidence. The wizards didn't care that this method gave no insights into the problems it attacked, or that they didn't understand how the solutions it yielded were related to the vast edifice of Euclidean mathematical theory. They weren't interested in theory. As far as they were concerned, if an algorithm gave the right answer, then plug it in: it was good to go.
Sven Beckert, Empire of Cotton
This is a really good global history of the development of the world's cotton industry from the opening of trans-Atlantic navigation down through about 1950. (An epilogue considers later events, but very cursorily.) The central incident is of course the industrial revolution that began in England in the late 18th century, which could only attain the scale it did because there were other parts of the world, notably the Americas, which could supply cotton on the requisite industrial scale; they did so through slavery. After abolition, the Americas also provided the pattern for making sure formally-free rural cultivators produced cotton for the market, rather than farming for subsistence, a pattern eagerly and often explicitly copied by imperial powers across the globe. Cotton was not just the first truly modern industry, it was for a long time the most important, and is arguably still one of the most important on a global scale, and so its story is, in large part, the story of how we got here.

I have, as a supremely unqualified but opinionated non-historian, some quibbles. Stylistically, he over-uses pet phrases like "the empire of cotton" and "the white gold", and keeps reminding readers that they are probably wearing cotton. Analytically, and more seriously, Beckert makes much more of this world-wide division of labor than of machinery, which is a mistake. Industrialism within one country (say, the American south) would have been quite feasible; a worldwide capitalism limited to animal power and manual labor would be at best a flexible and adaptive poverty. His account of the decline of cotton manufacturing in Europe and North America in the 20th century refers only to the difference in wages between those countries and places like China or India, ignoring differences in productivity.

On a different plane, this is possibly the only genuinely crypto-Marxist book to ever win the Bancroft prize.
The over-lap in themes just with Capital is very striking: the violence of capitalist primitive accumulation, the division of labor on a world scale, the struggle over the working day in the Lancashire mills, the deep importance attached to the American Civil War, the praise of capitalism for developing the productive forces to the point where something better becomes feasible and necessary. And also some post-Marx Marxist themes: late 19th century imperialism as driven by rivalry among capitalists, an autonomous role for the state (as something more than just an executive committee for managing the affairs of the bourgeoisie, though it is that too), very odd statements about the Soviet Union and Maoist China. That Marx is mentioned only once, and that in passing, is surely no coincidence.

Emily Horne and Joey Comeau, Anatomy of Melancholy: The Best of A Softer World
Calling A Softer World one of the best web-comics gives no idea whatsoever of its merits. I was deeply saddened to learn it would end in 2015, and only partially consoled by the prospect of this book. I commend it to anyone who reads this blog with pleasure.

Elliott Kay, Dead Man's Debt
Mind candy: sequel to Poor Man's War and Rich Man's Fight, bringing the series to a satisfying stopping point. Probably not enjoyable without the previous books.

Salla Simukka, As Red as Blood, As White as Snow, As Black as Ebony
Mind candy, aptly described by James Davis Nicoll as "what would happen if a plucky girl detective like Nancy Drew wandered into a Kurt Wallander novel?"

Hilary Mantel, Bring Up the Bodies
Mind candy: further literary, historical competence porn.

Christopher Hayes, Twilight of the Elites
First, go read the review/precis by Aaron Swartz (peace be upon him). I started this four years ago, then set it aside when I got busy, and only now took it back up (for obvious reasons) and finished it. Part of me reads this going "preach, brother, preach!"
In particular, the Iron Law of Meritocracy seems like a real contribution. (Pedantically, priority goes to James Flynn, though.) As a product of the meritocracy, whose parents are also products of the meritocracy, and who makes his living teaching at an elite school, this is not happy news, but there we are.

Other parts would like to see more: allowing that we have a self-serving and dysfunctional elite now, were previous elites really any more functional or less self-serving? This is hardly an obvious point, or one Hayes establishes. Generally, Hayes seems strongest when he's documenting the ways things are bad now, but he also needs to say that they're worse than before, or at least are bad in new ways, and that's lacking. [*] Of course this amounts to asking that he have written a different and much more academic book. As something at the border between a political tract and popular social science by a working journalist, it's astonishingly good.

*: Reading the bits about how the country feels like it's falling apart, I couldn't help thinking of the incredible opening to Joan Didion's "Slouching Towards Bethlehem":

The center was not holding. It was a country of bankruptcy notices and public-auction announcements and commonplace reports of casual killings and misplaced children and abandoned homes and vandals who misspelled even the four-letter words they scrawled. It was a country in which families routinely disappeared, trailing bad checks and repossession papers. Adolescents drifted from city to torn city, sloughing off both the past and the future as snakes shed their skins, children who were never taught and would never now learn the games that had held the society together. People were missing. Children were missing. Parents were missing. Those left behind filed desultory missing-persons reports, then moved on themselves.

Of course, Didion goes on:

It was not a country in open revolution. It was not a country under enemy siege.
It was the United States of America in the cold late spring of 1967, and the market was steady and the G.N.P. high and a great many articulate people seemed to have a sense of high social purpose and it might have been a spring of brave hopes and national promise, but it was not, and more and more people had the uneasy apprehension that it was not. All that seemed clear was that at some point we had aborted ourselves and butchered the job...

Seanan McGuire, Indexing: Reflections
Mind candy: more contemporary fantasy about fairy tales trying to escape from the dungeon dimensions and/or collective unconscious.

Posted at March 31, 2016 23:59 | permanent link

## March 19, 2016

### "Reassembling the History of the Novel"

Attention conservation notice: Only of interest if you (1) care about the quantitative history of English novels, and (2) will be in Pittsburgh at the end of the month.

I had nothing to do with making this happen — Scott Weingart did — but when the seminar gods offer me something this relevant to my interests, it behooves me to promote it:

Allen Riddell, "Reassembling the History of the Novel"

Abstract: How might the 19th century novel be studied and taught if all (surviving) novels were readily available to students and researchers? While many have lamented the fact that literary historians tend to ignore works outside the "canonical fraction" of the ~25,000 novels published in the British Isles during the 19th century, there have been few concrete proposals addressing the question of how surviving novels might productively enter research and teaching and participate in our thinking about the nexus of literature and society. This presentation describes the prospects for a data-intensive and sociologically-inclined history of the novel focused on the population of published novels, the novels' writers, and the writers' penumbra. (A group's penumbra is the set of individuals acquainted with members of the group.)
Marshalling evidence from a range of sources and aided by probabilistic models of text data, I will demonstrate how this approach yields insights into two significant developments in the history of the English novel: (1) the rapid influx of male writers after 1815, and (2) the dramatic increase in the rate of publication of novels after 1830. The presentation also features a discussion of Franco Moretti's call, echoing Karl Popper, that literary historians should advance risky --- and, in some cases, "testable" --- hypotheses.

Time and place: 4:30--5:30 pm on Wednesday, 30 March 2016 in Studio A, Hunt Library (first floor)

As always, the talk is free and open to the public.

Posted at March 19, 2016 20:24 | permanent link

## February 29, 2016

### Books to Read While the Algae Grow in Your Fur, February 2016

Attention conservation notice: I have no taste.

Douglas A. Blackmon, Slavery by Another Name: The Re-Enslavement of Black Americans from the Civil War to World War II
The story told here is just as appalling as the sub-title promises. Blackmon focuses on Alabama, but makes it clear that stuff like this happened all over the South. Since this is popular rather than professional history, there is a bit more of you-are-there detail than I am completely comfortable with, and I wish there had been more about things like the Great Migration and the impact of agricultural mechanization. But it's still very well written, and the story it tells deserves to be much better known. Two comparatively minor points:

1. There were actually cases under Theodore Roosevelt of white men in the south being brought to court for holding black men as slaves. (The legal defense was that while amendments to the Constitution had banned slavery, there were no actual laws against it, so no crime.)
This has all the elements which a big strand of our popular mythology looks for: a courtroom drama in which a fearless prosecutor and dedicated investigators, with the support of a reforming president, uncover a vast criminal enterprise, persuade reluctant witnesses to testify, bring the case before the public eye and an honest judge — and the whole thing failed to do the slightest bit of good. I think Blackmon has to be aware of how this part of his narrative fits with these motifs, but fails to have the expected ending; it's probably all the more effective for his not being explicit about it.

2. It is probably irrational to feel more of a shameful connection to these injustices because U.S. Steel (and so Andrew Carnegie, and so Carnegie Tech) was one of the beneficiaries Blackmon highlights, but I do.

Matt Ruff, Lovecraft Country
Victor LaValle, The Ballad of Black Tom
Mind candy. Ruff's book follows the mis-adventures of an African-American family of science fiction fans in 1950s Chicago, in a world where it's not clear whether eldritch abominations or ordinary life is more soul-destroying. It's a bit episodic, but still well done. LaValle's novella is a re-imagining of one of Lovecraft's most racist stories, "The Horror at Red Hook", from the perspective of a black Harlemite who would have been, at best, a nameless minion in the original. It's an interesting choice of a work to re-imagine, because even drawing a veil over the bigotry, it's not one of Lovecraft's better stories. Why respond to an ugly piece of bad fiction from almost a century ago? The only good reason is that there is, underneath all the purple prose and the all-too-transparent fears, something of real imaginative power and value in Lovecraft's work, and that value should be available even to those whom he cast as monsters. The fact that LaValle is much better at cosmic horror than Lovecraft was in "Red Hook" is just icing on the cake.
If this is intriguing, it's worth reading LaValle and Ruff in conversation. (Previously for Ruff; previously for LaValle.)

Jo Walton, The Just City
This is, obviously, exactly what would happen if Athena and Apollo conspired to realize The Republic with a population of time-traveling Platonists, 10,800 child slaves bought in antiquity, and robots. Exactly what would happen, down to Socrates trolling everyone so hard that, well --- read it. Genre note: I thought the chapters from Simmea's viewpoint did a very good job of both sounding plausible, and playing off the now-well-worn conventions of young adult dystopias. Because, of course, from a certain angle that's what the Republic would be. (Shoved to the top of the pile by the outstanding Crooked Timber symposium on this book and its sequel [which is on its way to me].)

Robert Jackson Bennett, City of Blades
Mind-candy fantasy, sequel to City of Stairs, continuing the story of how the first technological power in a fantasy world deals with the consequences of having killed all the gods. It is as awesome as its predecessor, though I should perhaps say that Bennett is quite prepared to deal brutally with sympathetic characters. (There was a moment near the end where I thought he was going to reprise the cyclical metaphysics of Mr. Shivers, but fortunately I was wrong.)

J. H. Conway, Regular Algebra and Finite Machines
I liked the first half or so. In particular, the notion of the derivative of one regular event with respect to another is neat in itself, and the corresponding Taylor series gives a very direct way of translating a regular expression into a finite machine. But then Conway zoomed off into the algebraic stratosphere, and if there was any tether connecting him back to actual problems with formal languages or automata, I completely lost track of it, and didn't see the point.
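The derivative idea is easy to see in code. Here is a minimal sketch (my own, in Brzozowski's style rather than Conway's notation) of taking derivatives of a regular expression letter by letter; matching a string then needs no explicit automaton at all, since the derivatives themselves are the machine's states:

```python
# Derivatives of regular expressions: deriv(r, a) matches w iff r matches aw.
# Tiny tuple-based AST; purely illustrative code.
EMPTY, EPS = ("empty",), ("eps",)

def lit(a): return ("lit", a)
def cat(r, s): return ("cat", r, s)
def alt(r, s): return ("alt", r, s)
def star(r): return ("star", r)

def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag == "empty": return False
    if tag == "eps":   return True
    if tag == "lit":   return False
    if tag == "cat":   return nullable(r[1]) and nullable(r[2])
    if tag == "alt":   return nullable(r[1]) or nullable(r[2])
    if tag == "star":  return True

def deriv(r, a):
    """The derivative of r with respect to the letter a."""
    tag = r[0]
    if tag in ("empty", "eps"): return EMPTY
    if tag == "lit":  return EPS if r[1] == a else EMPTY
    if tag == "alt":  return alt(deriv(r[1], a), deriv(r[2], a))
    if tag == "cat":
        d = cat(deriv(r[1], a), r[2])
        return alt(d, deriv(r[2], a)) if nullable(r[1]) else d
    if tag == "star": return cat(deriv(r[1], a), r)

def matches(r, w):
    for a in w:
        r = deriv(r, a)
    return nullable(r)

# (ab)*a
r = cat(star(cat(lit("a"), lit("b"))), lit("a"))
```

Enumerating the distinct derivatives of `r` under all letters (which Conway's Taylor-series view guarantees is a finite set, up to equivalence) gives exactly the states of a finite machine recognizing the same language.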
(This is formally self-contained as far as automata and language theory goes, but definitely presumes a strong grasp of abstract algebra. Its full appreciation also evidently presumes more mathematical maturity than I possess.)

Posted at February 29, 2016 23:59 | permanent link

## February 25, 2016

### "Analyzing large-scale data: Taxi Tipping behavior in NYC" (This Week at the Statistics Seminar)

Attention conservation notice: Only of interest if you (1) care about large-scale data analysis and/or taxis, and (2) will be in Pittsburgh on ~~Thursday~~ Friday.

The last, but by no means least, seminar talk this week:

Taylor Arnold, "Analyzing large-scale data: Taxi Tipping behavior in NYC"

Abstract: Statisticians are increasingly tasked with providing insights from large streaming data sources, which can quickly grow to be terabytes or petabytes in size. In this talk, I explore novel approaches for applying classical and emerging techniques to large-scale datasets. Specifically, I discuss methodologies for expressing estimators in terms of the (weighted) Gramian matrix and other easily distributed summary statistics. I then present an abstraction layer for implementing chunk-wise algorithms that are interoperable over many parallel and distributed software frameworks. The utility and insights garnered from these methods are shown through an application to an event based dataset provided by the New York City Taxi and Limousine Commission. I have joined these observations, which detail every registered taxicab trip from 2009 to the present, with external sources such as weather conditions and demographics. I use the aforementioned techniques to explore factors associated with taxi demand and the tipping behavior of riders. My focus is on developing novel techniques to facilitate interactive exploratory data analysis and to construct interpretable models at scale.
Time and place: ~~4:30--5:30 pm on Thursday, 25 February 2016~~ 4:30--5:30 pm on Friday, 26 February 2016, in Baker Hall A51

As always, the talk is free and open to the public.

Update: Dr. Arnold's talk has been pushed back a day due to travel delays.

Posted at February 25, 2016 11:16 | permanent link

### Denying the Service of a Differentially Private Database

Attention conservation notice: A half-clever dig at one of the more serious and constructive attempts to do something about an important problem that won't go away on its own. It doesn't even explain the idea it tries to undermine.

Jerzy's "cursory overview of differential privacy" post brings back to mind an idea which I doubt is original, but whose source I can't remember. (It's not Bambauer et al.'s "Fool's Gold: an Illustrated Critique of Differential Privacy" [ssrn/2326746], though they do make a related point about multiple queries.)

The point of differential privacy is to guarantee that adding or removing any one person from the database can't change the likelihood function by more than a certain factor; that the log-likelihood remains within $\pm \epsilon$. This is achieved by adding noise with a Laplace (double-exponential) distribution to the output of any query from the database, with the magnitude of the noise being inversely related to the required bound $\epsilon$. (Tighter privacy bounds require more noise.)

The tricky bit is that these $\epsilon$s are additive across queries. If the $i^{\mathrm{th}}$ query can change the log-likelihood by up to $\pm \epsilon_i$, a series of queries can change the log-likelihood by up to $\sum_{i}{\epsilon_i}$. If the database owner allows a constant $\epsilon$ per query, we can then break the privacy by making lots of queries. Conversely, if the $\epsilon$ per query is not to be too tight, we can only allow a small number of constant-$\epsilon$ queries.
A final option is to gradually ramp down the $\epsilon_i$ so that their sum remains finite, e.g., $\epsilon_i \propto i^{-2}$. This would mean that early queries were subject to little distortion, but later ones were more and more noisy.

One side effect of any of these schemes, which is what I want to bring out, is that they offer a way to make the database unusable, or nearly unusable, for everyone else. I make the queries I want (if any), and then flood the server with random, pointless queries about the number of cars driven by left-handed dentists in Albuquerque (or whatever). Either the server has a fixed $\epsilon$ per query, and so a fixed upper limit on the number of queries, or the noise grows after each query. In the first case, the server has to stop answering others' queries; in the second, eventually they get only noise. Or --- more plausibly --- whoever runs the server has to abandon their differential privacy guarantee.
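To make the bookkeeping concrete, here is a toy sketch of the fixed-$\epsilon$-per-query case (all names and numbers are my own illustration, not any real differential-privacy library): each counting query gets Laplace noise, the per-query $\epsilon$s are debited from a total budget, and once the budget is spent the server can only refuse, which is exactly what a flood of junk queries exploits.

```python
import math
import random

class PrivateCounter:
    """Toy differentially-private query server (illustrative only).

    Counting queries have sensitivity 1, so each answer gets
    Laplace(1/eps) noise; the epsilons add up across queries, and when
    the total budget is exhausted the server must return NA forever.
    """
    def __init__(self, data, eps_per_query=0.1, total_budget=1.0):
        self.data = data
        self.eps = eps_per_query
        self.budget = total_budget

    def count(self, predicate):
        if self.budget < self.eps:
            return None  # budget spent: NA, for this and every later query
        self.budget -= self.eps
        true_answer = sum(1 for row in self.data if predicate(row))
        # Laplace(b) noise by inverse-CDF sampling, b = sensitivity/eps
        u = random.random() - 0.5
        b = 1.0 / self.eps
        noise = -b * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
        return true_answer + noise

server = PrivateCounter(range(1000), eps_per_query=0.1, total_budget=1.0)
# ten pointless queries about left-handed dentists (or whatever)...
answers = [server.count(lambda row: row % 2 == 0) for _ in range(12)]
# ...and the budget is gone: answers[10] and everything after are None
```

The denial-of-service is visible in the last line: after the tenth query, every caller, honest or not, gets `None`.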

This same attack would also work, by the way, against the "re-usable holdout". That paper (not surprisingly, given the authors) is basically about creating a testing set, and then answering predictive models' queries about it while guaranteeing differential privacy. To keep the distortion from blowing up, only a limited number of queries can be asked of the testing-set server. That is, the server is explicitly allowed to return NA, rather than a proper answer, and it will always do so after enough questions. In the situation they imagine, though, of the server being a "leaderboard" in a competition among models, the simple way to win is to put in a model early (even a decent model, for form's sake), and then keep putting trivial variants of it in, as often as possible, as quickly as possible. This is because each time I submit a model, I deprive all my possible opponents of one use of the testing set, and if I'm fast enough I can keep them from ever having their models tested at all.
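The leaderboard version of the attack is just as easy to sketch. This is purely my illustration (a hypothetical server with a fixed shared query budget, not the actual construction in the paper), but it shows why submitting fast is a winning strategy:

```python
class Leaderboard:
    """Toy re-usable-holdout leaderboard with a shared query budget.

    The server answers at most max_queries accuracy queries in total;
    after that it returns NA (None) to everyone, forever.
    """
    def __init__(self, max_queries):
        self.remaining = max_queries
        self.best = {}

    def submit(self, team, accuracy):
        if self.remaining == 0:
            return None  # NA: the shared budget is gone
        self.remaining -= 1
        self.best[team] = max(accuracy, self.best.get(team, 0.0))
        return self.best[team]

board = Leaderboard(max_queries=100)
# the attack: put in one decent model, then trivial variants of it,
# as often and as quickly as possible
for i in range(100):
    board.submit("flooder", 0.70)
# rivals can now never get any model scored at all
print(board.submit("rival", 0.99))  # -> None
```

The flooder's score is mediocre, but no competitor's model is ever tested, so mediocre wins.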

Posted at February 25, 2016 11:09 | permanent link

## February 24, 2016

### On the Ethics of Expert Advocacy Consulting

Attention conservation notice: A ponderous elaboration of an acerbic line by Upton Sinclair. Written so long ago I've honestly forgotten what incident provoked it, then left to gather dust and re-discovered by accident.

A common defense of experts consulting for sometimes nefarious characters in legal cases is that the money isn't corrupting if the expert happens to agree with the position anyway. So, for instance, if someone with relevant expertise has doubts about the link between cigarette smoking and cancer, or between fossil-fuel burning and global warming, what harm does it do if they accept money from Philip Morris or Exxon to defray the costs of advocating those doubts? By assumption, they're not lying about their expert opinion.

The problem with this excuse is that it pretends people never change their ideas. When we deal with each other as more-or-less honest people — when we treat what others say as communications rather than as manipulations — we do assume that those we're listening to are telling us things more-or-less as they see them. But we are also assuming that if the way they saw things changed, what they said would track that change. If they encountered new evidence, or even just new arguments, they would respond to them, they would evaluate them, and if they found them persuasive, they would not only change their minds, they would admit that they had done so. (Cf.) We know it can be galling for anyone to admit that they were wrong, but that's part of what we're asking for when we trust experts.

And now the problem with the on-going paid advocacy relationship becomes obvious. It adds material injury to emotional insult as a reason not to admit that one has changed one's mind. The human animal being what it is, this becomes a reason not to change one's mind --- to ignore, or to explain away, new evidence and new argument.

Sometimes the new evidence is ambiguous, the new argument has real weaknesses, and then this desire not to be persuaded by it can perform a real intellectual function, with each side sharpening the other. (You could call this "the cunning of reason" if you wanted to be really pretentious.) But how is the non-expert to know whether your objections are really sound, or whether you are desperately BS-ing to preserve your retainer? Maybe they could figure it out, with a lot of work, but they would be right to be suspicious.

Posted at February 24, 2016 00:26 | permanent link

## February 21, 2016

### On the Uncertainty of the Bayesian Estimator

Attention conservation notice: A failed attempt at a dialogue, combining the philosophical sophistication and easy approachability of statistical theory with the mathematical precision and practical application of epistemology, dragged out for 2500+ words (and equations). You have better things to do than read me vent about manuscripts I volunteered to referee.

Scene: We ascend, by a dubious staircase, to the garret loft space of Confectioner-Stevedore Hall, at Robberbaron-Bloodmoney University, where we find two fragments of the author's consciousness, temporarily incarnated as perpetual post-docs from the Department of Statistical Data Science, sharing an unheated office.

Q: Are you unhappy with the manuscript you're reviewing?

A: Yes, but I don't see why you care.

Q: The stabbing motions of your pen are both ostentatious and distracting. If I listen to you rant about it, will you go back to working without the semaphore?

A: I think that just means you find it easier to ignore my words than anything else, but I'm willing to try.

Q: So, what is getting you worked up about the manuscript?

A: They take a perfectly reasonable — though not obviously appropriate-to-the-problem — regularized estimator, and then go through immense effort to Bayesify it. They end up with about seven levels of hierarchical priors. Simple Metropolis-Hastings Monte Carlo would move as slowly as a continental plate, so they put vast efforts into speeding it up, and in a real technical triumph they get something which moves like a glacier.

Q: Isn't that rather fast these days?

A: If they try to scale up, my back-of-the-envelope calculation suggests they really will enter the regime where each data set will take a single Ph.D. thesis to analyze.

Q: So do you think that they're just masochists who're into frequentist pursuit, or do they have some reason for doing all these things that annoy you?

A: Their fondness for tables over figures does give me pause, but no, they claim to have a point. If they do all this work, they say, they can use their posterior distributions to quantify uncertainty in their estimates.

Q: That sounds like something statisticians should want to do. Haven't you been very pious about just that, about how handling uncertainty is what really sets statistics apart from other traditions of data analysis? Haven't I heard you say to students that they don't know anything until they know where the error bars go?

A: I suppose I have, though I don't recall that exact phrase. It's not the goal I object to, it's the way quantification of uncertainty is supposed to follow automatically from using Bayesian updating.

Q: You have to admit, the whole "posterior probability distribution over parameter values" thing certainly looks like a way of expressing uncertainty in quantitative form. In fact, last time we went around about this, didn't you admit that Bayesian agents are uncertain about parameters, though not about the probabilities of observable events?

A: I did, and they are, though that's very different from agreeing that they quantify uncertainty in any useful way — that they handle uncertainty well.

Q: Fine, I'll play the straight man and offer a concrete proposal for you to poke holes in. Shall we keep it simple and just consider parametric inference?

A: By all means.

Q: Alright, then, I start with some prior probability distribution over a finite-dimensional vector-valued parameter $\theta$, say with density $\pi(\theta)$. I observe $x$ and have a model which gives me the likelihood $L(\theta) = p(x;\theta)$, and then my posterior distribution is fixed by $\pi(\theta|X=x) \propto L(\theta) \pi(\theta)$. This is my measure-valued estimate. If I want a set-valued estimate of $\theta$, I can fix a level $\alpha$ and choose a region $C_{\alpha}$ with $\int_{C_{\alpha}}{\pi(\theta|X=x) d\theta} = \alpha$. Perhaps I even preferentially grow $C_{\alpha}$ around the posterior mode, or something like that, so it looks pretty. How is $C_{\alpha}$ not a reasonable way of quantifying my uncertainty about $\theta$?
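(Q's recipe is concrete enough to run. A grid-based sketch, with a Binomial likelihood and flat prior chosen purely for illustration, growing the region around the posterior mode:)

```python
# Q's construction on a grid: Binomial likelihood (n=20, x=14), flat
# prior on theta in (0,1), highest-posterior-density region at alpha=0.95.
n, x, alpha = 20, 14, 0.95
grid = [(i + 0.5) / 1000 for i in range(1000)]
prior = [1.0] * len(grid)                          # pi(theta), unnormalized
lik = [t ** x * (1 - t) ** (n - x) for t in grid]  # L(theta)
post = [l * p for l, p in zip(lik, prior)]
total = sum(post)
post = [w / total for w in post]                   # pi(theta | X=x) on the grid

# grow C_alpha around the posterior mode: add grid points in decreasing
# order of posterior density until the accumulated mass reaches alpha
order = sorted(range(len(grid)), key=lambda i: -post[i])
mass, chosen = 0.0, []
for i in order:
    chosen.append(i)
    mass += post[i]
    if mass >= alpha:
        break
lo, hi = grid[min(chosen)], grid[max(chosen)]
print(f"C_0.95 is roughly ({lo:.2f}, {hi:.2f})")
```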

A: To begin with, I don't know the probability that the true $\theta \in C_{\alpha}$.

Q: How is it not $\alpha$, like it says right there on the label?

A: Again, I don't understand what that means.

Q: Are you attacking subjective probability? Is that where this is going? OK: sometimes, when a Bayesian agent and a bookmaker love each other very much, the bookie will offer the Bayesian bets on whether $\theta \in C_{\alpha}$, and the agent will be indifferent so long as the odds are $\alpha : 1-\alpha$. And even if the bookie is really a damn dirty Dutch gold-digger, the agent can't be pumped dry of money. What part of this do you not understand?

A: I hardly know where to begin. I will leave aside the color commentary. I will leave aside the internal issues with Dutch book arguments for conditionalization. I will not pursue the fascinating, even revealing idea that something which is supposedly a universal requirement of rationality needs such very historically-specific institutions and ideas as money and making book and betting odds for its expression. The important thing is that you're telling me that $\alpha$, the level of credibility or confidence, is really about your betting odds.

Q: Yes, and?

A: I do not see why should I care about the odds at which you might bet. It's even worse than that, actually, I do not see why I should care about the odds at which a machine you programmed with the saddle-blanket prior (or, if we were doing nonparametrics, an Afghan jirga process prior) would bet. I fail to see how those odds help me learn anything about the world, or even reasonably-warranted uncertainties in inferences about the world.

Q: May I indulge in mythology for a moment?

A: Keep it clean, students may come by.

Q: That leaves out all the best myths, but very well. Each morning, when woken by rosy-fingered Dawn, the goddess Tyche picks $\theta$ from (what else?) an urn, according to $\pi(\theta)$. Tyche then draws $x$ from $p(X;\theta)$, and $x$ is revealed to us by the Sibyl or the whisper of oak leaves or sheep's livers. Then we calculate $\pi(\theta|X=x)$ and $C_{\alpha}$. In consequence, the fraction of days on which $\theta \in C_{\alpha}$ is about $\alpha$. $\alpha$ is how often the credible set is right, and $1-\alpha$ is one of those error rates you like to go on about. Does this myth satisfy you?

A: Not really. I get that "Bayesian analysis treats the parameters as random". In fact, that myth suggests a very simple yet universal Monte Carlo scheme for sampling from any posterior distribution whatsoever, without any Markov chains or burn-in.

Q: Can you say more?

A: I should actually write it up. But now let's try to de-mythologize. I want to know what happens if we get rid of Tyche, or at least demote her from resetting $\theta$ every day to just picking $x$ from $p(x;\theta)$, with $\theta$ fixed by Zeus.
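The "simple yet universal Monte Carlo scheme" the myth suggests can be sketched as plain rejection sampling, at least when the data are discrete. The model below (a Beta(2,2) prior with a Binomial(10, $\theta$) likelihood) is my own toy choice, not anything from the dialogue:

```python
import numpy as np

rng = np.random.default_rng(0)

# Play out the myth many times: Tyche draws theta from the prior, then x from
# the likelihood.  Keeping only the days on which the revealed x matches the
# observed value yields exact draws from the posterior -- no Markov chains,
# no burn-in.  (Only practical when x is discrete and low-dimensional.)
n, x_obs = 10, 7
theta = rng.beta(2, 2, size=2_000_000)   # Tyche picks theta ~ pi(theta)
x = rng.binomial(n, theta)               # ... then x ~ p(x; theta)
posterior_draws = theta[x == x_obs]      # condition on the oracle saying "7"

# The exact posterior here is Beta(2 + 7, 2 + 3), with mean 9/14
print(posterior_draws.mean())
```

The price of this simplicity is the rejection rate: the acceptance probability is the prior-predictive probability of the observed data, which collapses rapidly as the data grow.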

Q: I think you mean Ananke; Zeus would meddle with the parameters to cheat on Hera. Anyway, what do you think happens?

A: Well, $C_{\alpha}$ depends on the data, it's really $C_{\alpha}(x)$. Since $x$ is random, $X \sim p(\cdot;\theta)$, so is $C_{\alpha}$. It follows a distribution of its own, and we can ask about $Pr_{\theta}(\theta \in C_{\alpha}(X) )$.

Q: Haven't we just agreed that that probability is just $\alpha$?

A: No, we've seen that $\int{Pr_{\theta}(\theta \in C_{\alpha}(X) ) \pi(\theta) d\theta} = \alpha$, but that is a very different thing.

Q: How different could it possibly be?

A: As different as we like, at any particular $\theta$.

Q: Could the 99% credible sets contain $\theta$ only, say, 1% of the time?

A: Absolutely. This is the scenario of Larry's playlet, but he wrote that up because it actually happened in a project he was involved in.
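To see how large the gap can be, here is a minimal simulation in the spirit of that scenario, with assumptions entirely of my own choosing (a standard normal prior, a single observation $X \sim N(\theta,1)$, and a true $\theta$ far out in the prior's tail); it is not Larry's actual example:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Prior N(0,1) on theta; one observation X ~ N(theta, 1).  The posterior is
# then N(x/2, 1/2), and the 99% credible interval is x/2 +/- 2.576*sqrt(1/2).
# Zeus fixes theta = 6, far out in the prior's tail, and we ask how often
# the credible interval actually covers it.
theta_true = 6.0
z = stats.norm.ppf(0.995)                       # about 2.576
x = rng.normal(theta_true, 1.0, size=100_000)   # re-run the experiment
post_mean, post_sd = x / 2, np.sqrt(0.5)
covered = np.abs(post_mean - theta_true) <= z * post_sd
print(covered.mean())   # frequentist coverage: about 1%, not 99%
```

The prior drags the posterior toward zero, so the "99%" interval almost never reaches out to the true $\theta$.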

Q: Isn't it a bit artificial to worry about the long-run proportion of the time you're right about parameters?

A: The same argument works if you estimate many parameters at once. When the brain-imaging people do fMRI experiments, they estimate how tens of thousands of little regions in the brain ("voxels") respond to stimuli. That means estimating tens of thousands of parameters. I don't think they'd be happy if their 99% intervals turned out to contain the right answer for only 1% of the voxels. But posterior betting odds don't have to have anything to do with how often bets are right, and usually they don't.

Q: Isn't "usually" very strong there?

A: No, I don't think so. D. A. S. Fraser has a wonderful paper, which should be better known, called "Is Bayes Posterior just Quick and Dirty Confidence?", and his answer to his own question is basically "Yes. Yes it is." More formally, he shows that the conditions for Bayesian credible sets to have correct coverage, to be confidence sets, are incredibly restrictive.

Q: But what about the Bernstein-von Mises theorem? Doesn't it say we don't have to worry for big samples, that credible sets are asymptotically confidence sets?

A: Not really. It says that if you have a fixed-dimensional model, and the usual regularity conditions for maximum likelihood estimation hold, so that $\hat{\theta}_{MLE} \rightsquigarrow \mathcal{N}(\theta, n^{-1}I(\theta))$, and some more regularity conditions hold, then the posterior distribution is also asymptotically $\mathcal{N}(\theta, n^{-1}I(\theta))$.
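For what the theorem does deliver, here is a one-dimensional sketch (my own toy model, not from the dialogue): Bernoulli data with a Beta(2,2) prior, where at large $n$ the exact posterior interval and the Wald interval around the MLE nearly coincide.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Bernoulli(theta) data with a Beta(2,2) prior, so the posterior is
# Beta(2 + successes, 2 + failures).  Bernstein-von Mises says that for
# large n this posterior is close to N(theta_hat, theta_hat(1-theta_hat)/n).
n, theta = 100_000, 0.3
successes = rng.binomial(1, theta, size=n).sum()
theta_hat = successes / n
se = np.sqrt(theta_hat * (1 - theta_hat) / n)

wald = stats.norm.interval(0.95, loc=theta_hat, scale=se)   # MLE interval
cred = stats.beta.interval(0.95, 2 + successes, 2 + n - successes)
print(wald, cred)   # the endpoints agree to several decimal places
```

The agreement is the point: in this regime the posterior adds essentially nothing beyond the likelihood.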

Q: Wait, so the theorem says that when it applies, if I want to be Bayesian I might as well just skip all the MCMC and maximize the likelihood?

A: You might well think that. You might very well think that. I couldn't possibly comment.

Q: !?!

A: Except to add that the theorem breaks down in the high-dimensional regime where the number of parameters grows with the number of samples, and goes to hell in the non-parametric regime of infinite-dimensional parameters. (In fact, Fraser gives one-dimensional examples where the mis-match between Bayesian credible levels and actual coverage is asymptotically $O(1)$.) As Freedman said, if you want a confidence set, you need to build a confidence set, not mess around with credible sets.

Q: But surely coverage — "confidence" — isn't all that's needed? Suppose I have only a discrete parameter space, and for each point I flip a coin which comes up heads with probability $\alpha$. Now my $C_{\alpha}$ is all the parameter points where the coin came up heads. Its expected coverage is $\alpha$, as claimed. In fact, if I can come up with a Gygax test, say using the low-significance digits of $x$, I could invert that to get my confidence set, and get coverage of $\alpha$ exactly. What then?

A: I never said that coverage was all we needed from a set-valued estimator. It should also be consistent: as we get more data, the set should narrow in on $\theta$, no matter what $\theta$ happens to be. Your Gygax sets won't do that. My point is that if you're going to use probabilities, they ought to mean something, not just refer to some imaginary gambling racket going on in your head.
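Q's coin-flip construction is easy to simulate (the parameter space and the value of $\alpha$ below are my own choices). The coverage comes out exactly as advertised, but the set ignores the data completely, so more data never shrink it:

```python
import numpy as np

rng = np.random.default_rng(3)

# Discrete parameter space; include each candidate theta in the set iff an
# independent coin with heads-probability alpha comes up heads.  Coverage is
# alpha by construction, but the set carries no information about theta.
alpha = 0.95
params = np.arange(20)       # twenty candidate parameter values (toy choice)
theta_true = 7
n_sims = 100_000

heads = rng.random((n_sims, params.size)) < alpha   # one coin per candidate
coverage = heads[:, theta_true].mean()              # P(theta_true in set)
avg_size = heads.sum(axis=1).mean()                 # expected set size

print(coverage)   # about 0.95, as advertised
print(avg_size)   # about 19 of the 20 points: coverage without information
```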

Q: I am not going to let this point go so easily. It seems like you're insisting on calibration for Bayesian credible sets, that the fraction of them covering the truth be (about) the stated probability, right?

A: That seems like a pretty minimal requirement for treating supposed probabilities seriously. If (as on Discworld) "million to one chances turn up nine times out of ten", they're not really million to one.

Q: Fine — but isn't the Bayesian agent calibrated with probability 1?

A: With subjective probability 1. But failure of calibration is actually typical or generic, in the topological sense.

Q: But maybe the world we live in isn't "typical" in that weird sense of the topologists?

A: Maybe! The fact that Bayesian agents put probability 1 on the "meager" set of sample paths where they are calibrated implies that lots of stochastic processes are supported on topologically-atypical sets of paths. But now we're leaning a lot on a pre-established harmony between the world and our prior-and-model.

Q: Let me take another tack. What if calibration typically fails, but typically fails just a little — say probabilities are really $p(1 \pm \epsilon )$ when we think they're $p$. Would you be very concerned, if $\epsilon$ were small enough?

A: Honestly, no, but I have no good reason to think that, in general, approximate calibration or coverage is much more common than exact calibration. Anyway, we know that credible probabilities can be radically, dismally off as coverage probabilities, so it seems like a moot point.

Q: So what sense do you make of the uncertainties which come out of Bayesian procedures?

A: "If we started with a population of guesses distributed like this, and then selectively bred them to match the data, here's the dispersion of the final guesses."

Q: You don't think that sounds both thin and complicated?

A: Of course it's both. (And it only gets more complicated if I explain "selective breeding" and "matching the data".) But it's the best sense I can make, these days, of Bayesian uncertainty quantification as she is computed.

I want to know how differently the experiment, the estimate, could have turned out, even if the underlying reality were the same. Standard errors — or median absolute errors, etc. — and confidence sets are about that sort of uncertainty, about re-running the experiment. You might mess up, because your model is wrong, but at least there's a sensible notion of probability in there, referring to things happening in the world. The Bayesian alternative is some sort of sub-genetic-algorithm evolutionary optimization routine you are supposedly running in your mind, while I run a different one in my mind, etc.
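The "selective breeding" picture can be sketched as importance resampling; this rendering, and the toy model (a N(0, 4) prior and twenty N($\theta$, 1) observations), are my own assumptions, not A's:

```python
import numpy as np

rng = np.random.default_rng(4)

# Start with a population of guesses drawn from the prior, weight each guess
# by how well it matches the data (its likelihood), and resample in
# proportion to those weights.  The dispersion of the surviving population
# is the Bayesian uncertainty being glossed.
data = rng.normal(1.5, 1.0, size=20)          # the fixed data set
guesses = rng.normal(0, 2, size=200_000)      # initial population ~ prior
log_w = -0.5 * ((data[None, :] - guesses[:, None]) ** 2).sum(axis=1)
w = np.exp(log_w - log_w.max())               # Gaussian log-likelihoods
survivors = rng.choice(guesses, size=100_000, p=w / w.sum())

# For this conjugate model the exact posterior s.d. is 1/sqrt(1/4 + 20)
print(survivors.std())   # "the dispersion of the final guesses"
```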

Q: But what about all the criticisms of p-values and null hypothesis significance tests and so forth?

A: They all have Bayesian counterparts, as people like Andy Gelman and Christian Robert know very well. The difficulties aren't about not being Bayesian, but about things like testing stupid hypotheses, not accounting for multiple testing or model search, selective reporting, insufficient communication, etc. But now we're in danger of drifting really far from our starting point about uncertainty in estimation.

Q: Would you sum that up then?

A: I don't believe the uncertainties you get from just slapping a prior on something, even if you've chosen your prior so the MAP or the posterior mean matches some reasonable penalized estimator. Give me some reason to think that your posterior probabilities have some contact with reality, or I'll just see them as "quick and dirty confidence" — only often not so quick and very dirty.

Q: Is that what you're going to put in your referee report?

A: I'll be more polite.

Disclaimer: Not a commentary on any specific talk, paper, or statistician. One reason this is a failed attempt at a dialogue is that there is more Q could have said in defense of the Bayesian approach, or at least in objection to A. (I take some comfort in the fact that it's traditional for characters in dialogues to engage in high-class trolling.) Also, the non-existent Robberbaron-Bloodmoney University is not to be confused with the very real Carnegie Mellon University; for instance, the latter lacks a hyphen.