## December 31, 2015

### Books to Read While the Algae Grow in Your Fur, December 2015

Attention conservation notice: I have no taste. Also, this month when I wasn't reading textbooks on regression, I was doped to the gills on a mixture of binged TV shows, serial audio fiction and flu medicine.

Michael H. Kutner, Chris J. Nachtsheim and John Neter, Applied Linear Regression Models
J. J. Faraway, Linear Models with R
Sanford Weisberg, Applied Linear Regression
Having taught undergraduate linear regression for the first time this year, I had to pick a textbook, which meant reading a lot of them. These were the three I made it through cover to cover. Kutner et al. (henceforth KNW) is the one previously assigned for the class, and which we ended up keeping for reasons of continuity. Faraway's and Weisberg's were optional.
I have to say that over the course of the semester I came to really dislike KNW. The mathematical level is very low — I don't think anyone could read it and come away with any notion of why the numerator and the denominator in an $F$ test statistic are independent, or even that an $F$ test is a specialization of a likelihood ratio test. Which, OK, there's room for regression textbooks which aren't deeply into probability. But it has a most unhelpful devotion to things which were never more than kludges adapted to the computing hardware of 1950 or even 1920, like ANOVA tables, and endless attention to transformations which try to make things look more Gaussian and/or additive, never mind what they do to the interpretations. Against this, there are literally four pages on the bootstrap, and while leave-one-out cross-validation is mentioned, multi-fold CV isn't. It's almost as though the last forty years of statistics never happened. This in turn makes the explanation of fitting regression trees (another four pages, including examples) totally obscure. Now, I am sure that KNW know all this stuff perfectly well, but they don't teach it, at least not here, and I can't begin to fathom why. Even the data examples are small and antiquated and often just weak. (Who tries to predict house prices without information on location? Seriously, who?)
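For readers who haven't met it, multi-fold cross-validation is not complicated; here is a minimal sketch in Python/numpy (toy data of my own devising, not an example from any of these textbooks), showing that leave-one-out CV is just the special case where the number of folds equals the sample size:

```python
import numpy as np

# Toy data (illustrative only): a linear signal in three predictors plus noise.
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=0.5, size=n)

def kfold_mse(X, y, k=5, seed=1):
    """Estimate the out-of-sample MSE of OLS by k-fold cross-validation."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        Xtr = np.column_stack([np.ones(len(train)), X[train]])
        Xte = np.column_stack([np.ones(len(fold)), X[fold]])
        bhat, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        errs.append(np.mean((y[fold] - Xte @ bhat) ** 2))
    return float(np.mean(errs))

cv5 = kfold_mse(X, y, k=5)       # 5-fold CV estimate of prediction error
loo = kfold_mse(X, y, k=len(y))  # k = n recovers leave-one-out CV
print(cv5, loo)
```

Both estimates should land near the irreducible noise variance (here 0.25), which is the whole point: a dozen lines of code, and nothing a textbook written after 1980 has any excuse for omitting.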
Both Faraway and Weisberg are superior in several ways: neither goes far into probability, but they move much faster, they are more up to date about things like Gaussianity, ANOVA tables, non-linear models, etc. (*), their computing is better, and their examples are more serious. Faraway has more material on shrinkage estimators (ridge regression, lasso) than Weisberg, and several chapters on experimental design, which Weisberg hardly touches on. On the other hand, Weisberg does have a more gentle opening with material on scatterplots and on "simple" regression (i.e., with one predictor variable). At least with undergrads, starting soft like that is probably a good idea.
None of the three books has an adequate discussion of causal inference, though again Faraway has the most; at least none of them say anything actively harmful on the matter. All three put model diagnostics after parametric inference within the model, which I realize is the traditional order but makes little sense — why bother testing whether such-and-such a slope is exactly zero if the model is rubbish in the first place? (**)
All three are outrageously priced, with KNW being by far the worst. (When the Revolution comes, Big Textbook won't be the first up against the wall, but they'll get a low number***.)
Clearly, I do not recommend KNW for self-study, though either Faraway or Weisberg should be fine. I would need a truly compelling reason to assign KNW again in the future. I would be happy to use either Faraway or Weisberg, leaning towards the former.
*: E.g., Weisberg (sec. 9.3, p. 204): "The assumption of normal errors plays only a minor role in regression analysis. It is needed primarily for inference with small samples, and even then the bootstrap ... can be used for inference. Furthermore, nonnormality of the unobservable errors is very difficult to diagnose in small samples by examination of residuals." Or Faraway (sec. 3.2, p. 35): "It is not really necessary to specifically compute all the elements of the [ANOVA] table. As the originator of the table, Fisher said in 1931, it is 'nothing but a convenient way of arranging the arithmetic.' Since he had to do his calculations by hand, the table served a necessary purpose, but is not essential now."
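To make Weisberg's point concrete: bootstrapping a regression by resampling cases needs no Gaussian assumption at all. A minimal sketch (my own toy data with deliberately heavy-tailed errors, not an example from Weisberg or Faraway):

```python
import numpy as np

rng = np.random.default_rng(0)

# Small-sample toy data with deliberately non-Gaussian (t with 3 df) errors.
n = 30
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.standard_t(df=3, size=n)

def ols_slope(x, y):
    """Ordinary-least-squares slope from a simple regression of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    bhat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return bhat[1]

# Resample cases (x_i, y_i) with replacement and re-fit each time.
B = 2000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = ols_slope(x[idx], y[idx])

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"95% percentile-bootstrap CI for the slope: [{lo:.2f}, {hi:.2f}]")
```

The percentile interval comes straight from the empirical distribution of re-fitted slopes, so the shape of the error distribution is whatever the data say it is.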
**: Yes, there are circumstances where one might be interested in testing hypotheses about the best linear approximation, using a fixed set of variables, to the true regression. But then you couldn't test procedures which assume there's no approximation error! (Cf.)
***: Speaking of Big Textbook, my remarks are all about the 3rd edition of Weisberg, not the 4th edition, which I haven't seen. Some of the "new features" advertised by the publisher for the update, like covering the bootstrap, are actually in the 3rd edition.
Disclaimer: I am, supposedly, finishing my own textbook on statistics. But that book very deliberately presupposes a reader who has already gone through a course in linear regression, so it's not directly in competition with any of these.
J. Richard Büchi, Finite Automata, Their Algebras and Grammars: Towards a Theory of Formal Expressions (ed. Dirk Siefkes)
A presentation of automata theory, especially the theory of finite automata, as a branch of abstract algebra. This was a manuscript left incomplete at the time of Büchi's death, edited, but not completed, by Siefkes (e.g., there are references to never-written sections). While the writing is a bit self-indulgently opinionated (*), the interplay between algebraic and automata-theoretic ideas is very good. Though it's probably not a useful introduction to either abstract algebra or abstract automata, this would have been really good for me to have read in graduate school, and might still be helpful for some on-going projects.
*: File that under "takes one to know one".
Max Gladstone, Margaret Dunlap, Mur Lafferty and Brian Francis Slattery, Bookburners
Mind candy contemporary fantasy, in serial form, i.e., weekly installments of about 20--30 pages each. It's enjoyable, and also an interesting experiment with replicating in prose the TV-show structure of mixing forbidden-tome-of-the-week episodes with ones that advance a larger-scale plot. I liked the writing enough to track down other books by all four authors.
The Black Tapes
Imagine an NPR affiliate deciding to run a series about a skeptical paranormal investigator's unsolved cases. Now make the stories much creepier than whatever you were imagining, and let the stories start to overlap disquietingly as the season progresses.... (Also, make the voice of the reporter something I can actually stand to listen to, unlike the usual NPR voices, which I find only slightly more pleasant than the sound of nails on a chalkboard.) The first season, all there currently is, ends on a cliff-hanger, but more is promised in January.
Welcome to Night Vale
Mentioning The Black Tapes reminds me that I have been meaning to plug this podcast for years. (I believe I first found it through Kate Nepveu.) Each episode is, supposedly, about 25 minutes of community radio from the small high-desert American town of Night Vale. Night Vale is a town full of high school football stars, vague yet menacing government agencies, hipster record stores, hooded figures, dog parks, sheriff's secret policemen, school boards dominated by sentient glowing clouds (ALL HAIL), librarians, smiling gods, teenage girls with an intense devotion to literature, opera houses, miniature civilizations found beneath bowling lanes, condos, unsupported old oak doors appearing out of nowhere, etc., etc., etc. Some of the jokes build for months before coming to the pay-off. It's very much my sort of thing (I am not sure if the friend for whom I bought the "if you see something, say nothing and drink to forget" flask has quite forgiven me), even though I find the musical interludes (a.k.a. "the weather") almost uniformly forgettable.
Samuel Bowles, The New Economics of Inequality and Redistribution (in collaboration with Christina Fong, Herbert Gintis, Arjun Jayadev, and Ugo Pagano)
A short little collection of lectures (180 pages including preface and math-y appendices), drawing on Bowles's papers from the 2000s. (Hence the long list of collaborators.) The Big Idea here is that egalitarianism has got a lot more room for maneuver than the current conventional wisdom and economics (to the extent those are different) holds. "If I had to do a bumper sticker for the new economics of inequality it would be: INEQUALITY: IT DOESN'T WORK AND PEOPLE DON'T LIKE IT" (p. xiii).
The "people don't like it" part is the work of Bowles and collaborators on strong reciprocity, which I've gone on about at excessive length for many years.
My remarks on the "it doesn't work" part have grown excessive, so I'll try to spin that off into a post of its own.
ObDisclaimer: Sam is an acquaintance of long standing, both of us being affiliated with Santa Fe.
Person of Interest
Murder-of-the-week mind candy, with a flavoring of pre-crime. That the makers of the Machine had, apparently, never encountered the concept "false positive" is only too realistic, given what we know of how the national surveillance state thinks, but a few of them would have improved the show considerably.
The Librarians
Mind candy TV, adorable contemporary fantasy division.
(Somebody must have written a good essay on the way American pop culture tries to assimilate any sort of voluntary community to a nuclear family: who?)
Kathleen George, Simple
Mind candy mystery, continuing her series set in Pittsburgh; this time it's a politically motivated murder. (I refuse to call that a spoiler.) The characterization remains very good, and the jail scenes are convincing. The you-are-there details all concern neighborhoods I know very well, and are absolutely on target.
Lila Bowen, Wake of Vultures
Mind candy, young-adult western fantasy division. Enjoyable enough that I will keep an eye out for the inevitable sequel. (It's also a sign of the times that the protagonist of what is, in fact, a classically-formed western for young adults can be a part-Indian, part-black, bi-sexual girl raised in near-slavery who wants to be a boy.)
Fitzroy Maclean, Eastern Approaches
This is, apparently, a minor classic, and I can quite see why.
At the beginning, Maclean presents himself as a minor British diplomat in Paris in the 1930s, who got so bored with the diplomatic good life as to ask to be posted to Moscow. There, he spent a lot of time creeped out by the purges, and plotting to travel to Central Asia, which he managed to do twice. Those trips led to some really nice passages, as well as some which have been touched by the Racism Fairy. (Maclean presents his trips to Central Asia as entirely his own initiative, and for all I know he may not have been a spy, but British intelligence would've been idiots not to (at the very least) have debriefed him very thoroughly afterwards.) On his return from Turkestan to Moscow he took in the trial of Bukharin, and was, for an upper class British conservative, surprisingly sympathetic to the Old Bolsheviks.
In the second part, Maclean gets himself out of the diplomatic service into the regular army as a private, manages to get elected to Parliament (!), and posted to Egypt, where he joins the fledgling SAS, and has adventures in North Africa fighting the Italians, none of which could be considered a tactical success. Abducting an Iranian general reputed to be in league with the Germans goes better, and provides the opportunity for an appreciation of Isfahan and its historic architecture.
The third, and best, part of the book is about his time with the Communist partisans in Yugoslavia. His obvious respect and affection for the Yugoslavs in general, and for Tito and the other partisans in particular, comes through quite clearly, and there are remarkable passages of writing here, both about the country he was in and about the war. He never goes out of his way to grind the reader's face in horrors, but he doesn't pretend it was at all a clean business, without ugliness, the way (mere) propaganda would. He also never goes out of his way to present himself in a heroic light, though it's clear he did a lot more than just radio in requests for parachuted supplies.
(Thanks to "ajay" at Unfogged for recommending this.)

Posted at December 31, 2015 23:59 | permanent link

## November 30, 2015

### Books to Read While the Algae Grow in Your Fur, November 2015

Attention conservation notice: I have no taste.

John Scalzi, The End of All Things
Mind candy science fiction, latest in the series begun with Old Man's War. At the surface level, it's a fun series of skiffy adventures, in which there are schemes, explosions, gadgets, secret lairs, etc., etc. Inter-textually, this is Scalzi in conversation with Starship Troopers, having begun (in the first book) with a set-up which seems like a re-tread of Heinlein's ideology in that book, and then systematically having that universe collapse under the weight of (as my ancestors would've put it) its internal contradictions. I admit that taking pleasure in the latter aspect of the books is a recherché taste, and that I generally prefer my mind candy to be less inward-looking.
Sarah Vowell, Lafayette in the Somewhat United States
In which Vowell tackles the American Revolution, our memory of the Revolution, and how we owe our entire national existence to the French.
Kathleen George, Hideout
C. J. Lyons, Snake Skin
Two mind-candy mysteries, both, as it happens, set in Pittsburgh (*). George's is part of a continuing series of police procedurals, distinguished by really good characterization; here, the stand-out characters are the rather hopeless criminals. (One distinguishing feature of her books: the reader usually knows whodunnit very early on.) Lyons's, which I picked up by chance without realizing it had a local connection, is at once less elevated in its story-telling and more over-the-top in its action, but still passed the "I want to know what happens next" test.
*: Both are pretty good at the local color, at least by my standards as a mere ten-year resident rather than a real Yinzer. I admit I boggled at Lyons describing the immediate vicinity of the Pittsburgh Center for the Arts as a "blue collar" neighborhood, but then it occurred to me how rarely I go four blocks that way from it...
Jeff VanderMeer, Acceptance
High quality mind candy science fiction/horror, sequel to Annihilation and Authority. Here, at the end, we get to see the marvels hidden inside the terrors — and inside the marvels, more terrors. I actually found the fragment of an explanation we got fairly satisfying, and liked that it was only a fragment, though I realize that tastes may differ here.
John Milton, Paradise Lost
I tried this as a teenager, but don't think I got beyond the bit early in Book III where God the Father starts monologuing his plans to the Son. On this attempt I listened to an excellent audiobook (read by Ralph Cosham), and I loved it. The language is magnificent, as is Milton's attempt to depict action on a more-than-terrestrial scale. (Though his standards for mind-boggling vastness are comically small, compared to the actual universe shown to us by astronomy.) The ideology is rubbish, of course. So: score one for approaching literary classics in maturity, rather than as a callow youth.
Stray thoughts, probably already immensely refined in the libraries written about this book: (1) Those are some really vivid accounts of how things looked, for a blind man. (2) Sometimes it seems like Milton's trying to excise classical-mythological allusions in favor of Biblical ones (e.g., the places named in the invocation of the heavenly Muse at the very opening), but it's like he just can't stay away from them. (3) Similarly, I think there are very few historical or contemporary-geographical allusions to places in Europe, compared to quite striking ones for Asia (e.g., X 431ff) and even Africa ("Serraliona", X 703); if that's right, why?
Finally: I kept thinking, as I was immersed in this book about a creature bent on vengeance against its all-powerful creator, "What would Justice of Torren One Esk Nineteen make of this?"
(N.) Lee Wood, Kingdom of Lies and Kingdom of Silence
Mind candy mystery novels. The first is a combination of a procedural and an amateur-sleuth mystery; the second is just a procedural. They're well-told, with good characterization, but too many coincidences for me to be completely satisfied in the mysteries. (Picked up because of the quality of Wood's older science fiction and fantasy, particularly Looking for the Mahdi and Bloodrights.)

Posted at November 30, 2015 23:59 | permanent link

## November 17, 2015

### Course Announcement: 36-402, Advanced Data Analysis, Spring 2016

Attention conservation notice: Only relevant if you are a student at Carnegie Mellon University, or have a pathological fondness for reading lecture notes on statistics.

In the so-called spring, I will again be teaching 36-402 / 36-608, undergraduate advanced data analysis:

The goal of this class is to train you in using statistical models to analyze data — as data summaries, as predictive instruments, and as tools for scientific inference. We will build on the theory and applications of the linear model, introduced in 36-401, extending it to more general functional forms, and more general kinds of data, emphasizing the computation-intensive methods introduced since the 1980s. After taking the class, when you're faced with a new data-analysis problem, you should be able to (1) select appropriate methods, (2) use statistical software to implement them, (3) critically evaluate the resulting statistical models, and (4) communicate the results of your analyses to collaborators and to non-statisticians.

During the class, you will do data analyses with existing software, and write your own simple programs to implement and extend key techniques. You will also have to write reports about your analyses.

Graduate students from other departments wishing to take this course should register for it under the number "36-608". Enrollment for 36-608 is very limited, and by permission of the professors only.

Prerequisites: 36-401, with a grade of C or better. Exceptions are only granted for graduate students in other departments taking 36-608.

This will be my fifth time teaching 402, and the fifth time where the primary text is the draft of Advanced Data Analysis from an Elementary Point of View. (I hope my editor will believe that I don't intend for my revisions to illustrate Zeno's paradox.) It is the first time I will be co-teaching with the lovely and talented Max G'Sell.

Unbecoming whining: When I came to CMU, a decade ago, 402 was a projects class for about 10 students. It was larger than that when I inherited it.

| Year | Students receiving final grades |
|------|---------------------------------|
| 2011 | 69                              |
| 2012 | 88                              |
| 2013 | 90                              |
| 2015 | 115                             |
Since there are about 160 students in the pre-req class, I don't see how we can reasonably expect to get away with less than 140 of them continuing on to 402. (Even I can't teach 401 so badly that an eighth of them will get below a C.) This will mean at least six straight years of uninterrupted growth for 402, to the point where about 1/50 of the total undergraduate population will be taking it in the spring (and maybe 1/8 of all our undergrads will pass through it at some point). This has, of course, nothing to do with my qualities as an instructor, and everything to do with the apparently unstoppable increase in the number of students majoring in statistics and its kin. "Class sizes are doubling every five years" is clearly a better problem to have than "class sizes are halving every five years" (*), but we can neither count on the trend continuing for long, nor keep teaching in the same way. I think I can more or less continue with the same plan I had when the class was half as large, but if this goes on, something is going to have to change.

*: As I have said in a number of conversations over recent years, the nightmare scenario for statistics vs. "data science" is that statistics becomes a sort of mathematical analog to classics. People might pay lip-service to our value, especially people who are invested in pretending to intellectual rigor, but few would actually pay attention to anything we have to say.

Posted at November 17, 2015 22:54 | permanent link

## November 09, 2015

### "Inference in the Presence of Network Dependence Due to Contagion" (Next Week at the Statistics Seminar)

Attention conservation notice: Only of interest if you (1) care about statistical inference with network data, and (2) will be in Pittsburgh next week.

A (perhaps) too-skeptical view of statistics is that we should always think we have $n=1$, because our data set is a single, effectively irreproducible, object. With a lot of care and trouble, we can obtain things very close to independent samples in surveys and experiments. When we get to time series or spatial data, independence becomes a myth we must abandon, but we still hope that we can break up the data set into many nearly-independent chunks. To make those ideas plausible, though, we need to have observations which are widely separated from each other. And those asymptotic-independence stories themselves seem like myths when we come to networks, where, famously, everyone is close to everyone else. The skeptic would, at this point, refrain from drawing any inference whatsoever from network data. Fortunately for the discipline, Betsy Ogburn is not such a skeptic.

Elizabeth Ogburn, "Inference in the Presence of Network Dependence Due to Contagion"
Abstract: Interest in and availability of social network data has led to increasing attempts to make causal and statistical inferences using data collected from subjects linked by social network ties. But inference about all kinds of estimands, starting with simple sample means, is challenging when only a single network of non-independent observations is available. There is a dearth of principled methods for dealing with the dependence that such observations can manifest. We describe methods for causal and semiparametric inference when the dependence is due solely to the transmission of information or outcomes along network ties.
Time and place: 4--5 pm on Monday, 16 November 2015, in 1112 Doherty Hall

As always, the talk is free and open to the public.

Posted at November 09, 2015 22:14 | permanent link

### "Statistical Estimation with Random Forests" (This Week at the Statistics Seminar)

Attention conservation notice: Only of interest if you (1) are interested in seeing machine learning methods turned (back) into ordinary inferential statistics, and (2) will be in Pittsburgh on Wednesday.

Leo Breiman's random forests have long been one of the poster children for what he called "algorithmic models", detached from his "data models" of data-generating processes. I am not sure whether developing classical, data-model statistical-inferential theory for random forests would please him, or would have him spinning in his grave, but either way I'm sure it will make for an interesting talk.

Stefan Wager, "Statistical Estimation with Random Forests"
Abstract: Random forests, introduced by Breiman (2001), are among the most widely used machine learning algorithms today, with applications in fields as varied as ecology, genetics, and remote sensing. Random forests have been found empirically to fit complex interactions in high dimensions, all while remaining strikingly resilient to overfitting. In principle, these qualities ought to also make random forests good statistical estimators. However, our current understanding of the statistics of random forest predictions is not good enough to make random forests usable as a part of a standard applied statistics pipeline: in particular, we lack robust consistency guarantees and asymptotic inferential tools. In this talk, I will present some recent results that seek to overcome these limitations. The first half of the talk develops a Gaussian theory for random forests in low dimensions that allows for valid asymptotic inference, and applies the resulting methodology to the problem of heterogeneous treatment effect estimation. The second half of the talk then considers high-dimensional properties of regression trees and forests in a setting motivated by the work of Berk et al. (2013) on valid post-selection inference; at a high level, we find that the amount by which a random forest can overfit to training data scales only logarithmically in the ambient dimension of the problem.
(This talk is based on joint work with Susan Athey, Brad Efron, Trevor Hastie, and Guenther Walther.)
Time and place: 4--5 pm on Wednesday, 11 November 2015 in Doherty Hall 1112

As always, the talk is free and open to the public.

Posted at November 09, 2015 16:23 | permanent link

## November 03, 2015

### Kriging in Perspective (Teaching outtakes)

Attention conservation notice: 11 pages of textbook out-take on statistical methods, either painfully obvious or completely unintelligible.

I wrote up some notes on kriging for use in the regression class, but eventually decided teaching that and covariance estimation would be too much. Eventually I'll figure out how to incorporate it into the book, but in the meanwhile I offer it for the edification of the Internet.

Posted at November 03, 2015 19:00 | permanent link

### Housekeeping Notes

Blogging will remain sparse while I teach, finish the book, write grant proposals, try not to screw up being involved in a faculty search, do all the REDACTED BECAUSE PRIVATE things, and dream about research. In the meanwhile:

A Twitter account, opened at Tim Danford's instigation. This is a semi-automated new account which is just for announcing new posts here; it (and I use the pronoun deliberately) follows no one, I read nothing, and messages or attempts to engage might as well be piped to /dev/null.

My online notebooks are in the same process of incremental update they've been in for the last 21 years.

My on-going bookmarking, with short commentary. (Pinboard doesn't need my unsolicited endorsement, but has it.)

Tumblr, for pictures.

Posted at November 03, 2015 17:00 | permanent link

## October 31, 2015

### Books to Read While the Algae Grow in Your Fur, October 2015

Attention conservation notice: I have no taste.

Anne M. Pillsworth, Fathomless
Mind candy, sequel to Summoned (which I seem not to have blogged about), being the further education of a Lovecraftian sorcerer. Pillsworth tries very hard to maintain faithfulness to the canon, but with a sensibility which is just a bit less freaked by its own attraction to the not-like-me than Lovecraft was. It's clearly aimed at younger readers, but I'm not sure how many of them will have read enough eighty-year-old stories to appreciate it.
Mur Lafferty, The Shambling Guide to New York City and Ghost Train to New Orleans
Mind candy: the adventurous life of a travel-book editor, who discovers that the big city is, in fact, full of monsters — and she is, arguably, one of them.
Shirley Jackson, We Have Always Lived in the Castle
I remember this being a favorite book as a teenager, but I'd not read it for decades. It turns out I'd forgotten the last half or so, and it blew me away, again.
Oscar Kempthorne, Design and Analysis of Experiments
Very old-school, but very clear, experimental design; Kempthorne is extremely sound on the role of randomization, and what it does and does not let one estimate. Reading this now, it's amazing just how little one could actually calculate back then, outside of additive-and-Gaussian models, and so how much of the formal machinery was really about simplifying calculations. (Look at the gyrations he goes through to avoid having to explicitly invert matrices when getting least-squares estimates.)
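To illustrate the kind of simplification Kempthorne's machinery was built around (with toy numbers of my own, not his data): in a balanced design the cross-products matrix is diagonal, so the least-squares estimates reduce to treatment means, and no matrix inversion is ever needed.

```python
import numpy as np

# Balanced one-way layout: 3 treatments, 4 replicates each (toy data).
groups = np.repeat([0, 1, 2], 4)
y = np.array([10.0, 11, 9, 10, 14, 15, 13, 14, 8, 7, 9, 8])

# General route: least squares via the full design matrix.
X = np.zeros((len(y), 3))
X[np.arange(len(y)), groups] = 1.0
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Hand-calculation route: in a balanced design X'X is diagonal, so the
# least-squares estimates are just the treatment means -- which is why the
# classical ANOVA recipes could get by without ever inverting a matrix.
beta_means = np.array([y[groups == g].mean() for g in range(3)])

print(beta_lstsq)   # agrees with beta_means
print(beta_means)
```

The moment the design is unbalanced, the shortcut fails and one is back to solving the normal equations in general, which is exactly where the hand-computation gyrations came from.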
Jeffrey E. Barlough, The House in the High Wood: A Story of Old Talbotshire
Mind candy, of a very odd sort; only semi-recommended. On the surface, it's a dark historical fantasy set in rural 19th century England, complete with scenes of village life and a haunted mansion. The deeper in one goes, the more elements appear which are bizarre even for such a book — elements which are never explained. My best guess — n cnenyyry jbeyq jubfr uhzna vaunovgnagf ner qrfpraq sebz crbcyr jub pnzr sebz Ivpgbevna Oevgnva naq uryq ba gb gubfr zberf gb n evqvphybhf rkgrag (qryvorengr geniryref? fangpurq ol fbzr zlfgrevbhf sbepr? zreryl ivpgvzf bs enaqbz vagreqvzrafvbany jrveqarff?), cyhf n ybg bs ceruvfgbevp navznyf rkgvapg va bhe jbeyq, oebhtug bire ol gur fnzr cebprff — turns out to be not what the author had in mind, though not that far off either. This setting, I have to say, did nothing for me, but I can see how many would like it (*), and Barlough certainly has real skills as a novelist.
*: Bgure crbcyr zvtug fcrphyngr ba gur nccrny bs na vzntvanel jbeyq jurer gur fbyr yvtug bs pvivyvmngvba vf na Nzrevpna jrfg pbnfg vf ragveryl vaunovgrq ol JNFCf, jub qvqa'g rira unir gb rkgrezvangr nal angvirf gb trg gur ynaq, ohg jung qb V xabj nobhg Oneybhtu'f zbgvirf, be gur sne zber inevbhf barf bs uvf snaf? Jung V pna fnl pbasvqragyl gung, nf n jbex bs fcrphyngvir svpgvba, gur jbeyq-ohvyqvat vf ynhtunoyl jrnx. Gung na nygreangr irefvba bs bhe jbeyq jurer gur Vpr Ntrf arire raqrq, jurer gurer vf ab thacbjqre, naq jurer gur Nzrevpnf jrer havaunovgrq orsber Rhebcrnaf cynagrq frggyre pbybavrf, jbhyq unir n Oevgnva, zhpu yrff bar jubfr phygher va 1839 jnf whfg yvxr jung vg jnf urer, fubjf n gbgny snvyher bs uvfgbevpny frafr. Guvf vf bayl zngpurq ol gur vqrn gung n praghel naq n unys yngre, nsgre n tybony raivebazragny pngnfgebcur vapyhqvat, nzbat bgure guvatf, gur gbgny qvfehcgvba bs nyy ybat-qvfgnapr genqr, gung phygher jbhyq erznva pbzcyrgryl hapunatrq. (Naq vg qbrfa'g rira frrz gb or gung gurl bayl guvax gurl'ir cerfreirq guvatf hapunatrq.) Jbeyq-ohvyqvat vf, bs pbhefr, abg gur bayl iveghr sbe fcrphyngvir svpgvba --- zhpu gur fnzr pevgvpvfzf nccyl, zhgngvf zhgnaqvf, gb Anbzv Abivx'f vzzrafryl sha Ancbyrbavp frn qentba fgbevrf --- ohg urer vg xrcg wneevat zr.
I say this as someone who likes the idea of a North America which still has all the old Pleistocene megafauna.
Peter Straub, Houses without Doors
Mind candy: a collection of his horror stories, though two of these are really too long to be stories ("The Buffalo Hunter", 130 pages; "Mrs. God", 166). "Blue Rose" and "The Juniper Tree" were fine (they relate to Straub's novels, but stand alone); I did not care for "The Buffalo Hunter" at all. "A Short Guide to the City" is creepy (*), as is "Something About a Death, Something About a Fire"; in neither story does much of anything at all happen. "Mrs. God", finally, is a Gothic extravagance with a haunted stately house, hostile villagers, mysterious manuscripts, eerie parallels across generations, morally and biologically decayed aristocrats, a viewpoint character who doesn't so much have perceptions as a continuous running pathetic fallacy, and, because this is Straub, poetry. (Also, again because this is Straub, no explanations of anything at all.)
*: And, I'm afraid, just a bit racist in the way it describes the South Siders. Which is a shame, because that bit is also one of the best parts of the story.
Ann Leckie, Ancillary Mercy
At the end of the last volume, I thought there was no way this series could be satisfyingly finished in one more book. I should have had more trust in the author.
Spoiler-ish comments: V qvqa'g frr ubj Oerd pbhyq cbffvoyl jva ntnvafg Nannaqre Zvnannv --- abe qvq V guvax gung nsgre nyy guvf, Yrpxvr jnf tbvat gb unir Oerd hygvzngryl qrsrngrq. (Gubhtu n zrnare nhgube zvtug unir.) Jung V qvq abg pbhag ba jnf Oerd'f zvffvba bs crefbany ergevohgvba ribyivat vagb na NV yvorengvba zbirzrag, phyzvangvat va sbhaqvat gur Phygher.
Seth Dickinson, The Traitor Baru Cormorant
Mind candy fantasy epic. I picked this up on the recommendation of Kameron Hurley, and was not disappointed: it is the only fantasy novel I have run across which turns on questions of economics and imperialism, and still manages to avoid cynicism. (Which, come to think of it, is hard with realist fiction.) Further comments ROT-13'd for spoilers: Gung Oneh jbhyq orgenl gur eroryyvba jnf boivbhf rabhtu gb zr sebz gur zbzrag gur ercerfragngvir bs gur Uvqqra Znfgref znqr pbagnpg jvgu ure --- uryy, boivbhf rabhtu sebz gur gvgyr. Rira gur trareny angher bs gur svany grfg jnf boivbhf. Naq lrg vg fgvyy jnf dhvgr nssrpgvat, qrfcvgr zl univat abguvat crefbanyyl vairfgrq Oneh'f cnegvphyne inevrgl bs sbeovqqra ybir.
Whether Baru emerges at the end triumphant yet tragic, or merely tragic, I hesitate to say.
Raziuddin Aquil, Sufism, Culture and Politics: Afghans and Islam in Medieval North India
This divides fairly cleanly into three parts. The first is about the history of Sher Shah Sur, who, depending on your perspective, either was the successor to the Afghan (=Pashtun, pretty much) Lodi dynasty as sultan of Delhi and emperor of Hindustan, or was a rebel against the Timurids, temporarily expelling Humayun and setting the stage for Akbar. Aquil does a good job of setting out all the accounts from all the primary sources, which left me, at least, in a great deal of doubt about what exactly happened when. The second part is about the administration of the Afghan dynasties and their incorporation of local Rajputs into their imperial project. The third is about the political role of Sufi orders (including their stories about beating Hindu yogis in displays of supernatural force; disappointingly, Aquil does not inquire what stories contemporary yogis told about sufis) and the role of sufis in cross-religious syncretism. These are only loosely coupled to each other, though there are some connections.
Aquil presumes a reader familiar with at least the outlines of the political and religious history of northern India during the 15th and 16th centuries, and makes no concession to ignorance on this score. (I am not ashamed to admit how much I relied on my memories of Amar Chitra Katha comics read as a boy.) With even the minimal necessary background, however, he has some fascinating things to say, both about massive empires rising and falling over little more than a decade, and about how this was intertwined with both profound mystical spirituality and gross superstition (with, naturally, the superstition predominating).

Posted at October 31, 2015 23:59 | permanent link

## September 30, 2015

### Books to Read While the Algae Grow in Your Fur, September 2015

Attention conservation notice: I have no taste.

Linda Nagata, The Trials
Sequel to First Light, where the consequences of that adventure come home to roost. — If I say that these novels are near-future military hard science fiction, full of descriptions of imaginary technologies and of stuff blowing up, and clearly inspired by an anxious vision of America's ongoing decline, I am being perfectly truthful, and yet also quite misleading. People who enjoy books which fall under that rubric will find it very much the sort of thing they like; at the same time, normally I'd pay to avoid having to read such works, and yet found these two quite compelling, and eagerly await the conclusion.
Letizia Battaglia, Passion, Justice, Freedom --- Photographs of Sicily
Battaglia comes across as a bit of a crazy woman, but in a deeply admirable way; and, of course, a tremendous photographer.
Paul McAuley, In the Mouth of the Whale
Hard-SF space opera, set in the same future as his terrific The Quiet War and Gardens of the Sun, but many centuries later. (He's good at filling in enough of the back-story to make it separately readable.) In this book, we're plunged into a conflict over the star system around Fomalhaut among four different more-or-less-post-more-or-less-human clades, seen from three points of view, two of which prove to be peripheral grunts. (Spoiler: Jung, rneyl ba, nccrnef gb or bar bs gur zbfg uhzna ivrjcbvagf cebirf, va snpg, gb or cebsbhaqyl fgenatr, gubhtu guvf vf fbzrguvat ernqref bs gur cerivbhf obbx pbhyq thrff.) I thought it was very good, though not quite as great as those two earlier books.
Edward K. Muller (ed.), An Uncommon Passage: Traveling through History on the Great Allegheny Passage Trail
A decent collection of essays, and really pretty photos, on the natural and human history of what is today a bike route from Pittsburgh to Cumberland, Maryland (and so on to Washington, D.C.), but which has had a lot of other incarnations over the centuries. Of only local interest, but locally interesting.
ObSnapshots: From a bike trip last year.
Gillian Flynn, Dark Places
Mind candy mystery: In which the Satanic panic of the 1980s meets the economic collapse of family farming, and makes for something bitterly poisonous and engrossing. (Though arguably not as poisonous as some of what actually happened back then.)
Carolyn Drake, Wild Pigeon
Photos, collages and a translated story, meant to illustrate the contemporary life of the Uighurs in Xinjiang. Bought from the author; I learned about it from the New York Review blog.
Fernand Braudel, The Wheels of Commerce
I picked up this middle volume of a trilogy, without having read the first book, because someone left it in a free-books pile at work, and I was curious. Whoever got rid of their copy: thanks. This is a truly fascinating look at the development of the market economy and capitalism in early modern Europe, and to some extent in the rest of the old world at the same time, full of remarkable information (*) and perspectives, as well as chewy and questionable hypotheses.
One notable feature, for me, is that Braudel wants to distinguish between the development of a market economy and the development of capitalism. He does this not to suggest an early-modern pre-history for market socialism, but because he identifies capitalism with "the realm of investment and of a high rate of capital formation", i.e., the activities of men, and of firms, who made substantial investments of money which resulted, or could result, in high rates of return. This was, in this period, in finance (especially financing the developing sovereign territorial states), in long-distance trade, and in monopolies. These were activities which could hardly have gotten off the ground without a large market economy around them, but where competition was precisely what one would want to avoid...
I wish someone had told me before this that Braudel was a good writer, and not just an important historian. Also: I'd have given a lot to see what he might have made of the "new international trade theory" and "new economic geography", which were just forming at the time he was writing.
*: The bit on p. 556 where he says that a "prohibition on lending at interest" was a "condition not present in Islam" was rather boggling, and does leave me wondering about the accuracy of some of his other statements.
Sarah Vowell, Unfamiliar Fishes
The story of the American conquest of Hawaii, told in Vowell's signature style. (It works better read aloud than on the silent page.) With many thanks to "Uncle Jan" for my copy.
Iain M. Banks, Surface Detail
Mind candy: space opera, in which the Culture, in its own inimitable fashion, harrows Hell. Somewhat longer, I think, than it needed to be, but still compulsively readable.
Amanda Downum, Dreams of Shreds and Tatters
Mind candy, at the urban fantasy / horror border, in which Vancouver's art scene confronts an outbreak from the dungeon dimensions — or, more exactly, Carcosa. I quite enjoyed how Downum is able to use pretty much the full canonical Cthulhu Mythos, from the seventy steps down to the Dreamlands to night-gaunts and everything else, and manage to make it seem not a formulaic exercise but genuinely creepy. (And I mean "creepy" in the "hairs standing on the back of the neck" sense, not the "bigoted distant connection at Thanksgiving" [*] sense, which says something considering the source material.) I have the impression this novel didn't make much of an impact when it came out, but if so that's unfair.
*: Of course I'm not thinking of you, dear distant connection with whom I have shared Thanksgiving.
Kelley Armstrong, Deceptions
Mind-candy contemporary fantasy in which discovering that her biological parents are convicted serial killers is the least of the protagonist's problems. (Previously.)
Dana Goldstein, The Teacher Wars: A History of America's Most Embattled Profession
This is a very nicely done popular history not just of the teaching profession but also of the public schools, and of just why both have been such a point of political contention for so long — and why we keep trying incredibly similar fixes time after time. Because it's not an academic tome, it doesn't attempt to be altogether comprehensive, offering instead a series of portraits of particular episodes; but so far as an interested non-expert can judge, those episodes are well-chosen and the background to the portraits accurate.
(I read this a year ago, but forgot to blog it.)

Posted at September 30, 2015 23:59 | permanent link

## September 26, 2015

### "Robust Confidence Intervals via Kendall's Tau for Transelliptical Graphical Models" (Next Week at the Statistics Seminar)

Attention conservation notice: Publicity for an upcoming academic talk, of interest only if (1) you care about quantifying uncertainty in statistics, and (2) will be in Pittsburgh on Monday.

I am late in publicizing this, but hope it will help drum up attendance anyway:

Mladen Kolar, "Robust Confidence Intervals via Kendall's Tau for Transelliptical Graphical Models"
Abstract: Undirected graphical models are used extensively in the biological and social sciences to encode a pattern of conditional independences between variables, where the absence of an edge between two nodes $a$ and $b$ indicates that the corresponding two variables $X_a$ and $X_b$ are believed to be conditionally independent, after controlling for all other measured variables. In the Gaussian case, conditional independence corresponds to a zero entry in the precision matrix $\Omega$ (the inverse of the covariance matrix $\Sigma$). Real data often exhibits heavy tail dependence between variables, which cannot be captured by the commonly-used Gaussian or nonparanormal (Gaussian copula) graphical models. In this paper, we study the transelliptical model, an elliptical copula model that generalizes Gaussian and nonparanormal models to a broader family of distributions. We propose the ROCKET method, which constructs an estimator of $\Omega_{ab}$ that we prove to be asymptotically normal under mild assumptions. Empirically, ROCKET outperforms the nonparanormal and Gaussian models in terms of achieving accurate inference on simulated data. We also compare the three methods on real data (daily stock returns), and find that the ROCKET estimator is the only method whose behavior across subsamples agrees with the distribution predicted by the theory. (Joint work with Rina Foygel Barber.)
Time and place: 4--5 pm on Monday, 28 September 2015, in Doherty Hall 1112.
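The rank-based trick behind such robust estimators is easy to illustrate. What follows is my own toy sketch of the classical Kendall's-tau-to-correlation relation for elliptical distributions (ROCKET itself is a much more elaborate estimator for entries of the precision matrix; this only shows why tau is a natural starting point): for any elliptical distribution, $\rho = \sin(\pi \tau / 2)$, so tau yields a correlation estimate unbothered by heavy-tailed margins.

```python
# Sketch (not from the talk): for elliptical distributions, Kendall's tau
# relates to the correlation rho by rho = sin(pi * tau / 2), so a rank
# statistic recovers rho even under heavy tails.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(42)
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])

# Heavy-tailed elliptical data: bivariate t with 3 degrees of freedom,
# built as Gaussian / sqrt(chi-squared / df).
n, df = 2000, 3
z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
w = rng.chisquare(df, size=n) / df
x = z / np.sqrt(w)[:, None]

tau, _ = kendalltau(x[:, 0], x[:, 1])
rho_hat = np.sin(np.pi * tau / 2)   # sine transform, valid for ellipticals
print(f"true rho = {rho:.2f}, Kendall-based estimate = {rho_hat:.2f}")
```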

As always, the talk is free and open to the public.

Posted at September 26, 2015 23:58 | permanent link

### On the Nature of Things Humanity Was Not Meant to Know

Attention conservation notice: A ponderous, scholastic joke, which could only hope to be amusing to those who combine a geeky enthusiasm for over-written horror stories from the early 20th century with nerdy enthusiasm for truly ancient books.

I wish to draw attention to certain parallels between De Rerum Natura, an ancient epic and didactic poem expounding a philosophy which is blasphemous according to nearly* every religion, and the Necronomicon, a fictitious book of magic supposedly expounding a doctrine which is blasphemous according to nearly** every religion.

The Necronomicon was, of course, invented by H. P. Lovecraft for his stories in the 1920s and 1930s. In his mythos, it was written by the mad poet "Abdul Alhazred", who died in +738 by being torn apart by invisible monsters. The book then led a twisty life through a thin succession of manuscript copies and translations, rare and almost lost. The book was, supposedly, full of the horrible, nearly indescribable, secrets of the universe: explaining how the world is an uncaring yet quite material place, in which the Earth's past and future are full of monsters, but natural monsters, how the reign of humanity is a transient episode, and the gods are in reality powerful extra-terrestrial beings, without any particular care for humanity. Reading the Necronomicon drives one mad, or at the very least the frightful knowledge it imparts permanently warps the mind. There are, supposedly, about half-a-dozen copies in existence, kept under lock and key (except when the story requires otherwise).

De Rerum Natura ("On the Nature of Things") is an entirely real book, written by the poet Titus Lucretius Carus around -55; according to legend, the poet went mad and died as a result of taking a love potion. The book thereafter led a twisty life through a thin trail of manuscript copies, and was almost lost over the course of the middle ages. The book is quite definitely full of what Lucretius thought of as the secrets of the universe (whose resistance to description is a running theme): how the entire universe is material and everything arises from the fortuitous concourse of atoms, how every phenomenon no matter how puzzling has a rational and material explanation, how there is no after-life to fear. It describes how the Earth's past was full of thoroughly-natural monsters, how the reign of humanity and even the existence of the Earth is a transient episode, and how the gods are in reality powerful extra-terrestrial beings without any particular care for humanity, living (a Lovecraftian touch) in the spaces between worlds. In the centuries since its recovery, it has been retrospectively elevated into one of the great books of Western civilization (whatever that is).

If we are to believe the latest historian of its reception, reading De Rerum Natura started out as an innocent pursuit of more elegant Latin, but ended up permanently warping the greatest minds of Renaissance Europe. The inescapable conclusion is that the Enlightenment is the result of the real-life Necronomicon, a book full of things humanity was not meant to know, using the printing revolution of early modern Europe to take over the intellectual world, until (in the words of the lesser poet) "all the earth ... flame[d] with a holocaust of ecstasy and freedom". Of course the same thing looks different from the point of view of us cultists:

And thus you will gain knowledge, guided by a little labor,
For one thing will illuminate the next, and blinding night
Won't steal your way; all secrets will be opened to your sight,
One truth illuminate another, as light kindles light.

*: I insert the qualifier for the sake of my Unitarian Universalist friends. ^

**: I insert the qualifier for the sake of my Unitarian Universalist friends. ^

Spoiling the conceit: I have no reason to believe that Lovecraft was thinking of Lucretius at any point in writing any of his stories featuring the Necronomicon, or even that the history of De Rerum Natura influenced the "forbidden tome" motif which Lovecraft drew on (and amplified). I also do not think that the Enlightenment is really about "shouting and killing and revelling in joy". (Though it would be its own kind of betrayal of the Enlightenment for one of its admirers, like me, not to face up to the ways some of its ideas have been used to justify very great evils, particularly when Europeans imposed themselves on less powerful peoples elsewhere.) Rather, this is all the result of the collision in my head of Ada Palmer's interview by Henry Farrell with Palmer's earlier appreciation of Ruthanna Emrys's "Litany of Earth", plus Ken MacLeod's cometary Lucretian deities, and early imprinting on Bruce Sterling.

Finally, I would pay good money to read the alternate history where it was the Necronomicon which humanists discovered mouldering in a monastic library and revived, where its ideas are as thoroughly normalized, pervasive and surpassed as Lucretius's are, and copies of Kitab al-Azif can be found in any bookstore as a Penguin Classic, translated by a distinguished contemporary poet. Failing that, I would like to read Lucretius's explanation of why we need have no fear of shoggoths.

Posted at September 26, 2015 23:30 | permanent link

## September 04, 2015

### "Reproducibility and Reliability in Statistical and Data Driven Research" (Week after Next Coming Soon at the Statistics Seminar)

Attention conservation notice: Publicity for an upcoming academic talk, of interest only if (1) you will be in Pittsburgh and (2) you care about whether scientific research can be reproduced.

The timeliness of the opening talk of this year's statistics seminar is, in fact, an un-reproducible, if welcome, coincidence:

Victoria Stodden, "Reproducibility and Reliability in Statistical and Data Driven Research"
Abstract: The reproducibility of computational inferences from data is widely recognized as an emerging issue affecting the scientific reliability of results. This talk will motivate the rationale for this shift, and outline the problem of reproducibility. I will then present ongoing research on several solutions: empirical research on data and code publication; the pilot project for large scale validation of statistical findings; and the "Reproducible Research Standard" for ensuring the distribution of legally re-usable data and code. If time permits, I will present early results assessing the reproducibility of published computational findings. Some of this research is described in my co-edited books, Implementing Reproducible Research and Privacy, Big Data, and the Public Good.
Time and place: 4--5 pm on Monday, 14 September 2015, in Doherty Hall 1112 (but see the update below)

As always, the talk is free and open to the public.

Update, 14 September: Prof. Stodden's talk has had to be rescheduled; I will post an update with the new date once I know it.

Posted at September 04, 2015 13:19 | permanent link

## August 31, 2015

### Books to Read While the Algae Grow in Your Fur, August 2015

Attention conservation notice: I have no taste.

Roland and Sabrina Michaud, Mirror of the Orient
The Michauds' gorgeous photos from the 1960s and 1970s — mostly of Afghanistan, but also Turkey, Iran, and India — aptly paired with Persianate miniature paintings. This is a wonderful book I have coveted for many years, and I am very pleased to have finally scored a copy I could afford.
Alain Barrat, Marc Barthelemy and Alessandro Vespignani, Dynamical Processes on Complex Networks
Survey of the state of the field as of 2008. It is decent and generally clear, if not especially fast-paced, and covers ideas about network structure, percolation, synchronization of oscillators, epidemic models, diffusion of innovations (mapped on to epidemic models), and Kauffman's NK model in some detail. (They're pretty good on linkages between these.) On other biological processes they are vaguer.
I found the emphasis on results presuming exact power-law degree distributions less than compelling, and the apologia for this emphasis in the conclusion surprisingly wrong-headed. (It does no good to defend them as approximations unless you also show that conclusions continue to hold when the assumptions are in fact only approximately true --- that there is, as Herbert Simon once put it, continuity of approximation. And in many cases, you'd need very robust continuity of approximation indeed.) But I recognize that I am abnormally picky about this subject.
ObDisclaimer: I've met Prof. Vespignani once or twice, but I don't think I've ever met or corresponded with the other authors.
Kelley Armstrong, Sea of Shadows and Empire of Night
Mind candy: First two-thirds of a fantasy trilogy about the adventures of a pair of teenage shamans. It's surprisingly enjoyable, with surprisingly effective monsters. The human setting is inspired not by a vaguely feudal Europe, but by more-or-less Heian-era Japan, though there seems to be no equivalent of Buddhism (maybe the bit with the monks in the second book?), and making the !Ainu blonds and redheads hints at pandering to the audience.
Arthur E. Albert and Leland A. Gardner, Jr., Stochastic Approximation and Nonlinear Regression
This is all about on-line learning and stochastic gradient descent before it was cool:
This monograph addresses the problem of "real-time" curve fitting in the presence of noise, from the computational and statistical viewpoints. Specifically, we examine the problem of nonlinear regression where observations $\{Y_n: n= 1, 2, \ldots \}$ are made on a time series whose mean-value function $\{ F_n(\theta) \}$ is known except for a finite number of parameters $(\theta_1, \theta_2, \ldots \theta_p) = \theta^\prime$. We want to estimate this parameter. In contrast to the traditional formulation, we imagine the data arriving in temporal succession. We require that the estimation be carried out in real time so that, at each instant, the parameter estimate fully reflects all of the currently available data.
The conventional methods of least-squares and maximum-likelihood estimation ... are inapplicable [because] ... the systems of normal equations that must be solved ... are generally so complex that it is impractical to try to solve them again and again as each new datum arrives.... Consequently, we are led to consider estimators of the "differential correction" type... defined recursively. The $(n+1)$st estimate (based on the first $n$ observations) is defined in terms of the $n$th by an equation of the form $t_{n+1} = t_n + a_n[Y_n - F_n(t_n)]$ where $a_n$ is a suitably chosen sequence of "smoothing" vectors.
(It's not all time series though: section 7.8 sketches applying the idea to experiments and estimating response surfaces.) Accordingly, most of the book is about coming up with ways of designing the $a_n$ to ensure consistency, i.e., $t_n \rightarrow \theta$ (in some sense), especially $a_n$ sequences which are themselves very fast to compute.
Mathematically, of course, we've got much more powerful machinery for proving theorems about stochastic approximation these days, but Albert and Gardner's methods seem particularly clear to me. Also, it's more fun to think of these tools being used to estimate the orbital elements of satellites (as in the lovingly-detailed section 8.5) than for ad targeting, a.k.a. commercialized surveillance.
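The flavor of the recursion is easy to convey in a few lines. Here is my own toy example (not from the book): a scalar parameter, a nonlinear mean function $F_n(\theta) = e^{\theta x_n}$, and a "smoothing" gain $a_n$ taken as a decaying step size times the gradient of the mean function, with a truncation step for stability (Albert and Gardner use similar truncation devices; the specific model, constants, and names here are mine).

```python
# A minimal sketch of the recursion t_{n+1} = t_n + a_n [Y_n - F_n(t_n)]:
# real-time nonlinear least squares, where each new observation updates
# the estimate without ever re-solving the normal equations.
import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.5          # parameter to be recovered
sigma = 0.2               # observation noise level
c = 2.0                   # gain constant in a_n = (c / n) * dF/dtheta

def F(theta, x):          # nonlinear mean function: F_n(theta) = exp(theta * x_n)
    return np.exp(theta * x)

def dF(theta, x):         # its derivative in theta
    return x * np.exp(theta * x)

t = 0.0                   # initial guess
for n in range(1, 20001):
    x = rng.uniform(0.0, 1.0)                      # covariate arriving at time n
    y = F(theta_true, x) + sigma * rng.normal()    # noisy observation Y_n
    a_n = (c / n) * dF(t, x)                       # "smoothing" gain
    t = t + a_n * (y - F(t, x))                    # the recursive correction
    t = float(np.clip(t, 0.0, 1.0))                # truncation, for stability

print(f"true theta = {theta_true}, online estimate = {t:.3f}")
```

At every instant the current estimate reflects all the data seen so far, which is exactly the "real-time" requirement of the monograph.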
Xavier Guyon, Random Fields on a Network: Modeling, Statistics, and Applications
Lots of overlap with Gaetan and Guyon's Spatial Statistics and Modeling (unsurprisingly), though omitting point processes and going at greater depth into the math of random fields (e.g., spectral representations) on, mostly, regular lattices. I suspect most readers would be better served by the later book, but this is a useful reference for me.
Mircea Eliade, Cosmos and History: The Myth of the Eternal Return
My brief comments outgrew their bounds; I will try to bring them under some kind of control soon.
Paula Volsky, Illusion
An old favorite, re-read after a long interval. It holds up. So: if you'd like to read a secondary-world fantasy novel where a magic kingdom gets visited by a horrific and entirely deserved version of the French Revolution, with well-drawn characters on all sides, written by an author who clearly learned great lessons from Jack Vance but has very much her own voice, track this down.
Ta-Nehisi Coates, Between the World and Me
Commentary outsourced to Unfogged.
William H. Sandholm, Population Games and Evolutionary Dynamics
A readable textbook on evolutionary game theory. It's pretty much entirely devoted to mathematical methods for finding equilibria and deducing long-run dynamics, as opposed to substantive results about particular games (or even classes of games). The mathematical background is explained extensively, and well, in a series of chapter appendices, amounting to maybe a quarter of the text.
By "population game", Sandholm means one in which large numbers of agents all play simultaneously, and all agents making the same move receive the same payoff, which is solely a function of the current distribution of moves over players. Agents then update their strategies in some way which depends on what they did, on the pay-off, and perhaps on how many others played various different moves and their pay-offs. These "revision protocols" give rise to different evolutionary dynamics, but all ones which are Markov processes. Over limited stretches of time, these approximate the ordinary differential equations one gets from looking at the expected rates of change in strategy frequencies, with the approximation getting closer and closer as the population grows. Understanding the limiting behavior over indefinitely long stretches of time is trickier, since various limits (e.g., large population vs. low noise) do not necessarily yield the same predictions.
For the most part, Sandholm limits himself to revision protocols which have various reasonable properties, like continuity in the population distribution, or not requiring too much information of the agents. (The book pays no attention to empirical evidence about how human beings or other animals act in strategic or repeated-choice situations.) But he also has (what seems to me to be) a mildly perverse interest in revision protocols which will converge on Nash equilibria, not because they are plausible but, as nearly as I can tell, because this lets evolutionary and classical game theorists live in peace in the same economics department.
If this isn't already the economists' standard textbook on evolutionary game theory, it ought to be.
ETA: I really hope this is a different William H. Sandholm.
Gene Wolfe, Citadel of the Autarch
The end of the Book of the New Sun (previously: 1, 2, 3). I find that I had retained the bare outlines of the story from when I read it as a boy, but I must have appreciated almost nothing more than the story, and the sense of a very strange and very old, worn-out world. (For instance, the concrete symbols, the parallels, and the parodic inversions of Wolfe's Catholicism must have gone right over my head...) Having finished it, I continue to wonder at the sense of unexplained-but-explicable mysteries that Wolfe created (*), and to be unsure whether it would be possible to solve them by careful study of the books, or whether only Wolfe knows what he had in mind, or whether he merely aimed for that very effect and had no definite answers. (The first option seems too Protestant, too sola scriptura, somehow.)
*: For instance, is "Behind our efforts, let there be found our efforts" supposed to echo with the way the last chapter says that behind this Severian, there is another Severian?
Noelle Stevenson, Nimona
Greg Rucka and Matthew Southworth, Stumptown: The Case of the Baby in the Velvet Case
Comic-book mind candy. (Previously for Stumptown.)
Lauren Willig, The Lure of the Moonflower
Mind candy. I am surprised how sad I am to see this series end. Once again, Willig does a good job of taking characters who had been merely stock figures in previous books and turning them into people, while preserving continuity with those earlier books.
Sarah Lotz, The Three
Mind candy: This is nicely creepy, but it goes rather off the rails in the last part, where Lotz tries to go from localized weirdness to whole countries (and, by implication, the world) heading to hell in hand baskets. (Chfuvat gur HF vagb gurbpenpl naq Wncna vagb erivivat gur Terngre Rnfg Nfvna Pb-Cebfcrevgl Fcurer vf n ybg gb nfx bs guerr jrveq xvqf.) I do like, however, that she never actually explains what happened. Zl thrff, onfrq ba gur irel ynfg yvarf, vf gung gur Guerr ner va n ebyr-cynlvat tnzr, jvgu rirelbar ryfr orvat na ACP, creuncf va n fvzhyngvba.
Lois McMaster Bujold, Penric's Demon
Minor Bujold, but still Bujold, which is to say this novella leaves me wanting to read more adorable adventures of Penric and Desdemona. (For instance, jung jvyy Cra'f ernpgvba or jura ur naq Qrf ner va n ebznapr-abiry cybg?)
Joe Abercrombie, Half a War
Mind candy: conclusion to Abercrombie's Viking-ish trilogy (previously), and just as compulsively readable. There are some "Nooo!" moments (particularly for readers of previous books), and lots of bloodshed, brutality and betrayal (as I said: Viking-ish), but he pulled off an ending which does not show every hope as false or futile, which is triumph enough for his worlds.
ROT-13'd for spoilers: 1. Guvf obbx nyfb pbasvezf fbzrguvat V'q fhfcrpgrq fvapr gur ynfg bar, anzryl gung gur jbeyq bs gur Funggrerq Frn vf gur erzbgr nsgrezngu bs na heona, grpuabybtvpny pvivyvmngvba oybjvat vgfrys hc — vaqrrq vg frrzf irel yvxryl gung jr ner gur ryirf. 2. V nqzvg V thrffrq jebat nobhg gur vqragvgl bs gur genvgbe; V'z fgvyy abg fher gung vg ernyyl svgf jvgu jung'f orra rfgnoyvfurq bs Sngure Lneiv'f punenpgre naq qrrc phaavat.
Corinna Sara Bechko and Gabriel Hardman, Heathentown
Mind candy: seeing something nasty in a central Florida graveyard. Promising material, but somehow it never came together for me; it may work better for others.
G. R. Grimmett, Probability on Graphs: Random Processes on Graphs and Lattices [Book preprint]
Dense but very rich; it presumes no prior acquaintance with graph theory or spatial stochastic processes, but a very good grasp on measure-theoretic probability, and a lot of mathematical maturity. The first few chapters build up gradually from an opener on electrical circuits (!) to random spanning trees, self-avoiding random walks, "influence" theorems and phase transitions, percolation theory, and random cluster models. (I must at this point confess that I'd never got the point of random cluster models before.) Thereafter things become a bit more miscellaneous, touring the Ising model, the "contact" model of stochastic epidemics, other interacting particle systems, random graphs, and, finally, the Lorentz gas. The perspective is very much that of a pure probabilist, though mention is made of applications to, or non-rigorous results from, physics and statistics.
Lisa Jardine, Going Dutch: How England Plundered Holland's Glory
The subtitle promises a lot more than Jardine delivers; what we get instead is a series of more-or-less interesting but only slightly connected anecdotes about Anglo-Dutch high politics and cultural interchange in the 17th century. Since the century ended with the Netherlands conquering Britain, but somehow not turning it into a permanent dependency, I'd really like to read a much more systematic and analytical account.
Patrick Weekes, The Palace Job and The Prophecy Con
Very fluffy mind candy: heists in fantasyland. I'm not sure they'd have worked in any reading environment other than trans-continental airplane flights, but they did.
Patrick O'Brian, Blue at the Mizzen
I had resisted reading the last of the Aubrey-Maturin novels until now. Having done so, I'm not at all sure how I feel about it, because it is so obviously the opening to a new cycle of novels, which were never written.

Update, next day: added a link to Simon's comment on "continuity of approximation", and deleted an excessive "very". 4 September: replaced Simon link with one which should work outside CMU, fixed an embarrassing typo.

Posted at August 31, 2015 23:59 | permanent link

### Course Announcement: 36-401, Modern Regression, Fall 2015

For the first time, I will be teaching a section of the course which is the pre-requisite for my spring advanced data analysis class. This is an introduction to linear regression modeling for our third-year undergrads, and others from related majors; my section currently has eighty students. Course materials, if you have some perverse desire to read them, will be posted on the class homepage twice a week.

This course is the first one in our undergraduate sequence where the students have to bring together probability, statistical theory, and analysis of actual data. I have mixed feelings about doing this through linear models. On the one hand, my experience of applied problems is that there are really very few situations where the "usual" linear model assumptions can be maintained in good conscience. On the other hand, I suspect it is usually easier to teach people the more general ideas if they've thoroughly learned a concrete special case first; and, perhaps more importantly, whatever the merits of (e.g.) Box-Cox transformations might actually be, it's the sort of thing people will expect statistics majors to know...

Addendum, later that night: I should have made it clear in the first place that my syllabus is, up through the second exam, ripped off borrowed with gratitude from Rebecca Nugent, who has taught 401 outstandingly for many years.

Update, since people have asked for it, links here (see the course page for the source files for lectures):

As post-mortems, some thoughts on the textbook and alternatives, and general lessons learned.

Posted at August 31, 2015 13:52 | permanent link

## August 04, 2015

### Experimental Considerations Touching on the Art of Winning Lotteries

Attention conservation notice: Facile moral philosophy, loosely tied to experimental sociology.

Via I forget who, Darius Kazemi explaining "How I Won the Lottery". The whole thing absolutely must be watched from beginning to end.

Kazemi is, of course, absolutely correct in every particular. What he says in his talk about art goes also for science and scholarship. Effort, ability, networking — these can, maybe, get you more tickets. But success is, ultimately, chance.

I say this not just because it resonates with my personal experience, but because of actual experimental evidence. In a series of very ingenious experiments, Matthew Salganik, Peter Dodds and Duncan Watts have constructed "artificial cultural markets" — music download sites where they could manipulate how (if at all) previous consumers' choices fed into the choices of those who came later. In one setting, for example, people saw songs listed in order of decreasing popularity, but when you came to the website you were randomly assigned to one of a number of sub-populations, and you only saw popularity within your sub-population. Simplifying somewhat (read the papers!), what Salganik et al. showed is that while there is some correlation in popularity across the different experimental sub-populations, it is quite weak. Moreover, as in the real world, the distribution of popularity is ridiculously heavy tailed (and skewed to the right): the same song can end up dominating the charts or just scraping by, depending entirely on accidents of chance (or experimental design).

In other words: lottery tickets.

If one has been successful, it is very tempting to think that one deserves it, that this is somehow reward for merit, that one is somehow better than those who did not succeed and were not rewarded. The moral to take from Kazemi, and from Salganik et al., is that while those who have won the lottery are more likely to have done something to get multiple tickets than those who haven't, they are intrinsically no better than many losers. How, then, those who find themselves holding winning tickets should act is another matter, but at the least they oughtn't to delude themselves about the source of their good fortune.

Posted at August 04, 2015 23:11 | permanent link

## July 31, 2015

### Books to Read While the Algae Grow in Your Fur, July 2015

Attention conservation notice: I have no taste.

Marc Levinson, The Box: How the Shipping Container Made the World Smaller and the World Economy Bigger
This is by now a contemporary classic, which I should have read years ago. To enjoy it, you need to like geeking out over designing steel boxes; the culture of longshore work, the politics of their unions, and their (totally correct) fears of technological obsolescence; why container ports have economies of scale; and a dozen other things that usually lurk in the background of our world. If you read this weblog, it's probably right up your alley.
Further commentary is outsourced to Steve Laniel.
Hendrik Spruyt, The Sovereign State and Its Competitors: An Analysis of Systems Change
This is one of the few genuinely-evolutionary ventures in social science I've ever run across. Spruyt's aim, as his title suggests, is to explain how Europe came to be organized into sovereign territorial states, which subsequently imposed that mode of organization on the rest of the world. He wants a genuinely selectionist explanation, which he realizes means he needs to explain why such states survived, or tended to survive, while other, contemporary forms of polity did not. And he realizes that there were alternative forms of polity: not just feudalism, but also city-states (as in Italy) and city-leagues (as in the north), which were, for a time, serious contenders. Spruyt is very sound on how the causes which led to the formation of any of these polities need not be, and generally aren't, the same as the causes of their ultimate selection. It's very nice to see such a mass of historical detail intelligently organized and brought to bear on an interesting theoretical problem.
Being me, naturally I have some qualms or quibbles. (1) Spruyt essentially looks at three case studies: the French kingdom, the Hanseatic League, and the city-states of northern Italy. But his account, if valid, should generalize to at least the rest of Europe; I'd really like to see whether it does. (2) As a methodological point, the number of polities involved is very small, even if we go down to treating every city in the low countries or Tuscany as a distinct unit of selection. On general grounds of evolutionary theory, then, we should expect noise effects to be quite large relative to fitness differences, which in turn will make it hard to learn those differences. In other words, with so few kingdoms, city leagues, etc., to examine, I worry that Spruyt may just be creating narratives to retrospectively match mere chance. (The thought experiment here would be something like: in the alternate history which followed the same path as ours up to, say, 1450, but thereafter city leagues came to dominate western Europe, how hard would it be for alternate-Spruyt to assemble the split evidence into a case for the selective superiority of leagues, over sovereign territorial states?) (3) A lot of Spruyt's argument for why territorial states did better than city leagues is that the latter lacked a central locus of authority which could credibly negotiate with outsiders, and make agreements stick by imposing them on the constituent cities. So why did no one invent the idea of a league where the league itself was the sovereign? Or was it just that when they did, they called it the United Provinces, and they happened to form a contiguous territory? (4) Spruyt takes the rather odd position that variation and selection are two temporally successive phases of an evolutionary process, rather than just being logically and causally distinct. (This idea seems to arise from a rather forced-sounding interpretation of Stephen Jay Gould's papers on punctuated equilibrium.) This is, I think, both wrong as a matter of general evolutionary theory, and superfluous to his own actual argument. (5) The opening chapters spill much too much ink on very parochial internal debates of the international relations sub-sub-discipline, giving little sense of its wider relevance to social science.
(Thanks to Henry Farrell for pointing me at this.)
Kameron Hurley, The Mirror Empire
Hurley's earlier science fiction novels (1, 2) were enjoyable mind candy, but this is great mind candy: world-building in which the human, the fantastic, and the all-too-human mingle; multiple realms of fantastic weirdness; compelling characters; and truly epic scope to the action. It deserves much more intelligent appreciation, but I am still too caught up in the story to provide one. I am very impatient to read the sequels.
The one thing I will raise as a criticism is that I am pretty sure in twenty years the gender politics here will look as dated as those in, say, The Forever War do now. On the other hand, I will not be surprised if people are still reading this in twenty years; and on the prehensile tail, I understand why Hurley hit those notes so hard.
Charles Stross, The Annihilation Score
Latest installment in the series beginning with The Atrocity Archives, in which British secret agents try to deal with the Cthulhu Mythos and modern management. I doubt it's really that follow-able if you've not kept up with the series (though I think Stross intends it as an alternate entry point), so I will cheerfully spoil earlier books in the rest of this comment. Previous volumes, through The Rhesus Chart, have been narrated by IT-staffer Bob; this one by his wife and fellow spook Mo. As we know from The Jennifer Morgue, archetypically, Bob is a Bond girl; Mo is Bond. In this book, Mo is Bond going through a marital collapse, a mid-life crisis, and a nervous breakdown a bit of a rough patch, so her superiors respond by putting her in charge of a new department managing superheroes (= otherwise-innocent bystanders developing sanity- and/or brain- eating magical powers as the Stars Become Right). Hijinks ensue, for rather soul-destroying values of hijinks; also, she fights crime.
Mo, as narrator, sounds a bit too much like Bob (for instance, too many IT allusions, and none arising from music or from epistemology). But otherwise, it's only too convincing as portrait of a marriage collapsing; I have more quibbles with the plot. ( Univat rirelguvat or n snyfr-synt bcrengvba ol gur cbyvpr fdhnerf bqqyl jvgu gur gvzvat bs gur svefg vapvqrag naq vgf crecrgengbe'f qrngu; naq Zb zhfg'ir orra uvg jvgu n ovt vqvbg fgvpx gb znxr ab pbaarpgvba orgjrra ure vafgehzrag naq gur ivyynvaf fgrnyvat rfbgrevp zhfvpny fpberf.) On balance, while I read it in as close to one sitting as I could, I still feel it's below the peak of the series.
Danielle S. Allen, Our Declaration: A Reading of the Declaration of Independence in Defense of Equality
An attempt to argue, as the sub-title says, that the Declaration is at least as much about equality as it is about freedom, and indeed about equality as the grounds of freedom. I like it very much, and it is very persuasive; it makes me feel better about our country. But the big point of doubt I have is that lots of what Allen points to seems to really be about a republican form of government, or even what Bagehot called "government by discussion", which is perfectly compatible with vast degrees of in-equality.
ObLinkage: the Crooked Timber symposium on Our Declaration.
Disclaimer: I know Prof. Allen, and have participated in a series of workshops she organized and contributed to a book she edited, but I feel under no obligation to write a positive notice of her books.
Richard D. Mattuck, A Guide to Feynman Diagrams in the Many-Body Problem
I began this one twenty years ago in graduate school, and cannot for the life of me recall why I didn't finish it at once. (I was young, foolish, easily misled...) It's best described in the words it uses for one of its own examples: "a pedagogically ideal illustration of the qualities which made the graphical method famous: its power to do perturbation theory to infinite order (thus enabling it to cope with strong couplings beyond the reach of ordinary perturbation procedures), its highly systematic and so-called 'automatic' character, its vivid pictorial appeal, and its remarkable talent for producing results valid outside their region of convergence" (p. 276). It does presume good knowledge of quantum mechanics and statistical mechanics, but no quantum field theory is necessary, nor even, I think, precise recall of classical E&M.
— There must be a general account of when, and why, Feynman diagrams work for arbitrary Markov processes, and/or other situations where a probability density obeys a nice differential equation. Where is it? (This is a start.)
What is this I don't even?

Posted at July 31, 2015 23:59 | permanent link

## June 30, 2015

### Books to Read While the Algae Grow in Your Fur, June 2015

Attention conservation notice: I have no taste.

Walter Jon Williams, Brig of War
Mind candy historical adventure fiction: a tale of derring-do and angst in the nascent American navy during the war of 1812. It was written before Williams turned to science fiction, but in retrospect the seeds of a lot of his later concerns can be discerned here. In particular, the way the viewpoint protagonist is at once deeply embedded in an institution, indeed commits his life to it, and also an emotionally detached observer of that institution, will recur in many later books — I think Favian would have interesting conversations with Dagmar, Aiah or Martinez.
— No purchase link, since this is long out of print, but readily available from all the electronic book sellers.
(This is the only historical novel I know of which is set during the Napoleonic Wars, written by an American, and yet does not side with the British Empire. This partiality towards, if not wholehearted embrace of, the very system of global conquest, plunder and tyranny against which we fought the Revolution — the one which burnt Washington! — is astonishing. While I am reluctant to question the patriotism of our historical novelists, is any other conclusion available to the candid mind?)
Hilary Mantel, Wolf Hall
Mind candy: literary, historical competence porn*. Praise on my part is superfluous. Thanks to CM and TC for persuading me to start reading it, and for providing the term "competence porn".
*: "His speech is low and rapid, his manner assured; he is at home in courtroom or waterfront, bishop's palace or inn yard. He can draft a contract, train a falcon, draw a map, stop a street fight, furnish a house and fix a jury. He will quote you a nice point in the old authors, from Plato to Plautus and back again. He knows new poetry, and can say it in Italian. He works all hours, first up and last to bed. He makes money and he spends it. He will take a bet on anything."
László Györfi, Michael Kohler, Adam Krzyzak and Harro Walk, A Distribution-Free Theory of Nonparametric Regression
I can't remember having read a better, more comprehensive, clearer volume on the theory of nonparametric regression. It is magnificently unconcerned with the practicalities of applied statistics, but rather relentlessly focused on determining what we can learn about conditional expectation functions, and how fast, when we assume basically nothing about those functions, other than that they are well-defined and we get IID data. (In the last chapters, it even allows for dependent data.) The coverage is largely organized around different sorts of models (kernel smoothing, histograms, regression trees, local polynomials, splines, orthogonal series expansions...), typically beginning by defining the model, considering the model class's expressive or approximative powers, and then looking at how quickly it will converge on the true regression function under various smoothness assumptions on the latter. Classical minimax theory is used to establish that smoother functions (e.g., those with many continuous derivatives of low magnitude) can be learned more quickly than rougher functions, but naively, we'd seem to need to know how smooth the true function is in order to achieve these fast rates. Particularly nice models are "adaptive": they will automatically adjust to the data and learn almost as quickly as if they knew in advance how smooth the target was. Accordingly, a lot of space is given to looking at which methods are adaptive; many otherwise nice models don't adapt very well. Chapters on topics like minimax theory and empirical process theory break up the development of the models, introducing mathematical tools and general ideas as needed. Two chapters on cross-validation and data-splitting are particularly nice: everyone uses them, because they work, but there is surprisingly little theory about such important tools, and the results here are really quite illuminating.
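To make the data-splitting idea concrete, here is a toy sketch (my own invention, not anything from the book) of picking a kernel bandwidth by k-fold cross-validation; the smoother, the test function, and all the settings are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def nw_predict(x_train, y_train, x_test, h):
    # Nadaraya-Watson kernel regression with a Gaussian kernel of bandwidth h
    w = np.exp(-0.5 * ((x_test[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

def cv_score(x, y, h, k=5):
    # k-fold cross-validated mean squared prediction error for bandwidth h
    idx = rng.permutation(len(x))
    mse = 0.0
    for fold in np.array_split(idx, k):
        mask = np.ones(len(x), bool)
        mask[fold] = False
        pred = nw_predict(x[mask], y[mask], x[fold], h)
        mse += np.mean((y[fold] - pred) ** 2)
    return mse / k

# noisy sine curve: the true regression function is smooth but nonlinear
x = rng.uniform(0, 10, 300)
y = np.sin(x) + rng.normal(0, 0.3, 300)

bandwidths = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
best_h = min(bandwidths, key=lambda h: cv_score(x, y, h))
print(best_h)  # CV rejects the badly oversmoothed bandwidths
```

The point of the exercise is just that the data themselves, split into training and validation folds, steer the smoother away from gross over- or under-smoothing, which is the crude practical face of the adaptivity theory the book develops.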
In principle, all this book requires is a good grasp of probability theory and the math that goes along with it. Some of the proofs involve lengthy calculations, but none are tricky or mathematically deep, because they don't need to be. More realistically, I'd suggest some prior experience both with actually running non-parametric regressions (at the level of, say, The Elements of Statistical Learning), and with the characteristic concerns of non-parametric theory (say, All of Nonparametric Statistics, or Tsybakov). All of the major classes of regression models in common use around 2000 are included — and that includes all the models in common use today, except Gaussian processes. Serious statistical theorists interested in regression have already read the book; I recommend it for those into methodology or even applications, because it's very well done and it gives them a sense of what lies in the background.
(Thanks to Ryan T. for persuading me to not just browse this, as I'd been doing for a decade, but actually read it systematically.)
Stephen King, Finders Keepers
Mind candy: sequel to Mr. Mercedes, but enjoyable independently. This is because while some characters from that book are the nominal heroes here, the really central characters are new — an old thief and murderer, and an idealistic teenage boy, both, in different ways, the biggest fans of an (imaginary) mid-century American novelist who seems to interpolate between John Updike, J. D. Salinger and Henry Roth; the story is really about their rivalry for the manuscripts of his unpublished Great American Novels.
Patrick O'Brian, The Hundred Days
Ken Liu, The Grace of Kings
Mind candy: fantasy novel, based on the rise of the Han dynasty, with added squabbling gods, "silk-punk" technology, and glancing blows at patriarchy. I picked it up because of the quality of Liu's translation of The Three-Body Problem; I'll read the nigh-inevitable sequel.
John Sutton, Sunk Costs and Market Structure: Price Competition, Advertising, and the Evolution of Concentration
In this book, Sutton is looking at what determines the level of concentration in industries with fixed (set-up) costs, hence increasing returns and imperfect competition, and where advertising works, in the sense that by spending money on ads, firms can increase their sales at a given price. This tends to lead to concentrated markets, where a small number of firms capture a large proportion of sales. So far, so standard industrial organization. What sets Sutton's approach apart, and makes it really distinctive, is that he realizes the equilibria of reasonable models of entry, pricing and advertising decisions are incredibly sensitive to model details, but there are inequalities which hold across a very wide range of models. (He went on to elaborate on this in Technology and Market Structure, and give a programmatic statement in Marshall's Tendencies.) Specifically, for any given size of the market, he can put a lower bound on the degree of concentration (at equilibrium). The fixed costs of entry mean that this lower bound initially decreases with the size of the market. (The market has to be at least so big to pay back the cost of establishing multiple rival plants.) But if advertising is effective, after a certain point the lower bound actually increases in market size — it becomes advantageous for firms to ramp up the sunk costs of entering the market through intensive advertising.
While Sutton goes through some (comparatively) conventional econometric exercises to do things like estimate the lower bound on concentration as a function of the size of the market, the bulk of this book is taken up by wonderfully detailed qualitative applications of his theory to the evolution of concentration and corporate strategy in a wide range of food industries across the six largest industrial economies. This is somewhat dated, having been written in the 1980s, but still fascinating, for an admittedly-nerdy value of fascination. Even if you don't think you care about the comparative industrial organization of breakfast-cereal manufacturing, it's still a virtuoso performance in melding social-scientific theory with concrete history.
Charles Stross, Saturn's Children
Mind candy: it's hard out there for a fembot, especially when she was designed to be an "escort" for human males, and humanity, along with every other eukaryote, has been extinct for centuries. There are a lot of science-fictional in-jokes (e.g., the Scalzi museum of paleontology on Mars), and some of the revelations were things I got long before the protagonist did. (But maybe the reader was supposed to?) Overall, though, it works much better as a story in its own right than anything deliberately riffing off the later works of Robert Heinlein has any right to do.

Posted at June 30, 2015 23:59 | permanent link

## June 12, 2015

### In Memoriam Zalmai Shalizi

One summer, when I was a boy, Uncle Zalo tried to teach me to shoot up at his ranch in the New Mexico high country. I was dismal, and I'm pretty sure the phrase "broad side of a barn" crossed his mind. It never crossed his lips, and he was never less than patient and encouraging but honest.

I had picked out new epic fantasy novels to bring him the next time I came to Santa Fe, and I wanted to talk with him about The Eternal Sky. I miss him.

Anything else I might have to say was already better said in his obituary.

Posted at June 12, 2015 23:59 | permanent link

## May 31, 2015

### Books to Read While the Algae Grow in Your Fur, May 2015

Attention conservation notice: I have no taste.

Cixin Liu, The Three-Body Problem (translated by Ken Liu [no relation])
A really remarkably engrossing novel of first contact. (I will refer you to James Nicoll for plot summary.) As a novel of first contact, I think it bears comparison to some of the classics, like War of the Worlds and His Master's Voice: it realizes that aliens will be alien, and that however transformative contact might be, people will continue to be human, and to react in human ways.
— It has a lot more affinities with Wolf Totem than I would have guessed — both a recognizably similar mode of narration, and, oddly, some of the content — educated youths rusticated to Inner Mongolia during the Cultural Revolution, environmental degradation there, and nascent environmentalism. Three-Body Problem works these into something less immediately moving, but perhaps ultimately much grimmer, than Wolf Totem. I say "perhaps" because there are sequels, coming out in translations, which I very eagerly look forward to.
Elif Shafak, The Architect's Apprentice
Historical fiction, centered on the great Ottoman architect Sinan, but told from the viewpoint of one of his apprentices. I am sure that I missed a lot of subtleties, and I half-suspect that there are allusions to current Turkish concerns which are completely over my head. (E.g., the recurrence of squatters crowding into Istanbul from the country-side seems like it might mean something...) Nonetheless, I enjoyed it a lot as high-class mind candy, and will look for more from Shafak.
ROT-13'd for spoilers: Ohg jung ba Rnegu jnf hc jvgu gur fhqqra irre vagb snagnfl --- pbagntvbhf phefrf bs vzzbegnyvgl, ab yrff! --- ng gur raq?
Barry Eichengreen, Hall of Mirrors: The Great Depression, The Great Recession, and the Uses — and Misuses — of History [Author's book site]
What it says on the label: a parallel history of the Great Depression and the Great Recession, especially in the US, and of how historical memories (including historical memories recounted as economic theories) of the former shaped the response to the latter.
If anyone actually believed in conservatism, a conservative paraphrase of Eichengreen would run something like this: back in the day, when our ancestors came face to face with the consequences of market economies run amok, our forefathers (and foremothers) created, through a process of pragmatic trial and error, a set of institutions which allowed for an unprecedented period of stable and shared prosperity. Eventually, however, there arose an improvident generation (mine, and my parents') with no respect for the wisdom of its ancestors, enthralled by abstract theories, a priori ideologies, and Utopian social engineering, which systematically dismantled or subverted those institutions. In the fullness of time, they reaped what they had sown, namely a crisis, and a series of self-inflicted economic wounds, which had no precedent for fully eighty years. Enough of the ancestors' works remained intact that the results were merely awful, however, rather than the sort of utter disaster which could lead to substantial reform, or reconsideration of ideas. And here we are.
(Thanks to IB and ZMS for a copy of this.)
David Danks, Unifying the Mind: Cognitive Representations as Graphical Models
This book may have the most Carnegie Mellon-ish title ever.
Danks's program in this book is to argue that large chunks of cognitive psychology might be unified not by employing a common mental process, or kind of process, but because they use the same representations, which take the form of (mostly) directed acyclic graphical models, a.k.a. graphical causal models. In particular, he suggests that representations of this form (i) give a natural solution to the "frame problem" and other problems of determining relevance, (ii) could be shared across very different sorts of processes, and (iii) make many otherwise puzzling isolated results into natural consequences. The three domains he looks at in detail are causal cognition (*), concept formation and application, and decision-making, with hopes that this sort of representation might apply elsewhere. Danks does not attempt any very direct mapping of the relevant graphical models on to the aspects of neural activity we can currently record; this strikes me as wise, given how little we know about psychology today, and how crude our measurements of brain activity are.
Disclaimer: Danks is a faculty colleague at CMU, I know him slightly, and he has worked closely with several friends of mine (e.g.). It would have been rather awkward for me to write a very negative review of his book, but not awkward at all to have not reviewed it in the first place.
*: Interestingly to me, Danks takes it for granted that we (a) have immediate perceptions of causal relations, which (b) are highly fallible, and (c) in any case conform so poorly to the rules of proper causal models that we shouldn't try to account for them with graphical models. I wish the book had elaborated on this, or at least on (a) and (c).
F. Gregory Ashby, Statistical Analysis of fMRI Data
This is another textbook introduction, like Poldrack, Mumford and Nichols, so I'll describe it by contrast. Ashby gives very little space to actual data acquisition and pre-processing; he's mostly about what you do once you've got your data loaded into Matlab. (To be fair, this book apparently began as the text for one of two linked classes, and the other covered the earlier parts of the pipeline.) The implied reader is, evidently, a psychologist, who knows linear regression and ANOVA (and remembers there's some sort of link between them), and has a truly unholy obsession with testing whether particular coefficients are exactly zero. (I cannot recall a single confidence interval, or even a standard error, in the whole book.) Naturally enough, this makes voxel-wise linear models the main pillars of Ashby's intellectual structure. This also explains why he justifies removing artifacts, cleaning out systematic noise, etc., not as avoiding substantive errors, but as making one's results "more significant". (I suspect this is a sound reflection of the incentives facing his readers.) To be fair, he does give very detailed presentations of the multiple-testing problem, and even ventures into Fourier analysis to look at "coherence" (roughly, the correlation between two time series at particular frequencies), Granger causality, and principal and independent component analysis [1].
This implied reader is OK with algebra and some algebraic manipulations, but needs to have their hand held a lot. Which is fine. What is less fine are the definite errors which Ashby makes. Two particularly bugged me:
1. "The Sidak and Bonferroni corrections are useful only if the tests are all statistically independent" (p. 130): This is true of the Sidak correction but not of the Bonferroni, which allows arbitrary dependency between the tests. This mistake was not a passing glitch on that one page, but appears throughout the chapter on multiple testing, and I believe elsewhere.
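Bonferroni's validity under arbitrary dependence is just the union bound: the probability that any of $m$ tests rejects is at most the sum of the individual rejection probabilities, however the tests are correlated. A quick simulation (a toy setup of my own devising, not anything from Ashby) makes the point with strongly correlated null test statistics:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)
m, alpha, n_sim, r = 50, 0.05, 2000, 0.9

family_error = 0
for _ in range(n_sim):
    # strongly positively correlated z-scores, all under the global null
    common = rng.normal()
    z = sqrt(r) * common + sqrt(1 - r) * rng.normal(size=m)
    # two-sided p-value: p = erfc(|z| / sqrt(2))
    pvals = np.array([erfc(abs(zi) / sqrt(2)) for zi in z])
    if (pvals < alpha / m).any():  # Bonferroni: compare each p to alpha/m
        family_error += 1

fwer = family_error / n_sim
print(fwer)  # stays at or below alpha despite the strong dependence
```

Under positive dependence like this, Bonferroni is in fact conservative (the realized family-wise error rate falls well below $\alpha$); it is Sidak's correction, not Bonferroni's, whose exactness leans on independence.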
2. Chapter 10 repeatedly asserts that PCA assumes a multivariate normal distribution for the data. (This shows up again in chapter 11, by way of a contrast with ICA.) This is quite wrong; PCA can be applied so long as covariances exist. The key proposition 10.1 on p. 248 is true as stated, but it would still be true if all instances of "multivariate normal" were struck out, and all instances of "independent" were replaced with "uncorrelated". This is related to the key, distribution-free result, not even hinted at by Ashby, that the first $k$ principal components give the $k$-dimensional linear space which comes closest on average to the data points. Further, if one does assume the data came from a multivariate normal distribution, then the principal components are estimates of the eigenvectors of the distribution's covariance matrix, and so one is doing statistical inference after all, contrary to the assertion that PCA involves no statistical inference. (More than you'd ever want to know about all this.) [2]
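To see concretely that PCA needs only covariances, here is a small numerical sketch (a made-up example of mine, not from the book): it extracts principal components from manifestly non-Gaussian data, using nothing but the sample covariance matrix, and checks that the leading components beat a random subspace at reconstructing the data, as the distribution-free optimality result says they must:

```python
import numpy as np

rng = np.random.default_rng(2)

# clearly non-Gaussian data: skewed exponential latent factors, linearly mixed
n, p, k = 500, 5, 2
latent = rng.exponential(size=(n, k))
mixing = rng.normal(size=(k, p))
X = latent @ mixing + 0.1 * rng.uniform(-1, 1, size=(n, p))
Xc = X - X.mean(axis=0)

# PCA uses only the sample covariance: no distributional assumption anywhere
cov = (Xc.T @ Xc) / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalue order
V = eigvecs[:, ::-1][:, :k]                  # top-k principal directions

def recon_error(basis):
    # mean squared error of projecting the data onto the given subspace
    proj = Xc @ basis @ basis.T
    return np.mean((Xc - proj) ** 2)

# the top-k PC subspace beats a random orthonormal k-dimensional subspace
Q, _ = np.linalg.qr(rng.normal(size=(p, k)))
print(recon_error(V), recon_error(Q))
```

No normality, and no independence beyond what the covariance matrix encodes, enters at any step; the first $k$ components minimize the reconstruction error over all $k$-dimensional subspaces regardless of the data's distribution.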
The discussion of Granger causality is more conceptually confused than mathematically wrong. It's perfectly possible, contra p. 228, that activity in region $i$ causes activity in region $j$ and vice versa, even with "a definition of causality that includes direction"; they just need to both do so with a delay. How this would show up given the slow measurement resolution of fMRI is a tricky question, which Ashby doesn't notice. There is an even deeper logical flaw: if $i$ and $j$ are both being driven by a third source, which we haven't included, then $i$ might well help predict ("Granger cause") $j$. In fact, even if we include this third source $k$, but we measure it imperfectly, $i$ could still help us predict $j$, just because two noisy measurements are better than one [3]. Indeed, if $i$ causes $j$ but only through $k$, and the first two variables are measured noisily, we may easily get non-zero values for the "conditional Granger causality", as in Ashby's Figure 9.4. Astonishingly, Ashby actually gets this for his second worked example (p. 242), but it doesn't lead him to reconsider what, if anything, Granger causality tells us about actual causality.
While I cannot wholeheartedly recommend a book with such flaws, Ashby has obviously tried really hard to explain the customary practices of his tribe to its youth, in the simplest and most accessible possible terms. If you are part of the target audience, it's probably worth consulting, albeit with caution.
[1] Like everyone else, Ashby introduces ICA with the cocktail-party problem, but then makes it about separating speakers rather than conversations: "Speech signals produced by different people should be independent of each other" (p. 258). To be fair, I think we've all been to parties where people talk past each other without listening to a thing anyone else says, but I hope they're not typical of Ashby's own experiences.
[2] Of course, Ashby introduces PCA with a made-up example of two test scores being correlated and wanting to know if they measure the same general ability. Of course, Ashby concludes the example by saying that we can tell both tests do tap in to a common ability by their both being positively correlated with the first principal component. You can imagine my feelings.
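The distribution-free optimality of PCA mentioned in the review is easy to demonstrate numerically. A sketch in Python on deliberately non-Gaussian data (all specifics — the mixing matrix, sample size, seed — are arbitrary): the leading principal component beats every other direction at one-dimensional reconstruction, with no normality anywhere in sight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Decidedly non-Gaussian data: uniform and exponential coordinates,
# pushed through a fixed linear map so the observed variables are correlated.
n = 2000
latent = np.column_stack([rng.uniform(-1, 1, n),
                          rng.exponential(1.0, n),
                          rng.uniform(-1, 1, n)])
A = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.2],
              [0.0, 0.3, 0.5]])
X = latent @ A.T
X = X - X.mean(axis=0)  # center; PCA only needs the covariance to exist

# Principal components = eigenvectors of the sample covariance matrix.
cov = (X.T @ X) / n
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
V = eigvecs[:, order]  # columns: PC directions, by decreasing variance

def recon_error(direction):
    """Mean squared error of projecting the data onto one direction."""
    proj = X @ np.outer(direction, direction)
    return np.mean(np.sum((X - proj) ** 2, axis=1))

# The first PC spans the one-dimensional subspace closest to the data
# on average: no random unit direction does better.
err_pc1 = recon_error(V[:, 0])
for _ in range(200):
    w = rng.standard_normal(3)
    w /= np.linalg.norm(w)
    assert recon_error(w) >= err_pc1 - 1e-9
```

The same check works for any $k$-dimensional projection, and nothing in it changes if the data are made even less Gaussian.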
[3] For the first case, say $X_i(t) = X_k(t) + \epsilon_i(t)$, $X_j(t) = X_k(t) + \epsilon_j(t)$, with the two noise terms $\epsilon_i, \epsilon_j$ independent, and $X_k(t)$ following some non-trivial dynamics, perhaps a moving average process. Then predicting $X_i(t+1)$ is essentially predicting $X_k(t+1)$ (and adding a little noise), and the history of $X_i$, $X_i(1:t)$, will generally contain strictly less information about $X_k(t+1)$ than will the combination of $X_i(1:t)$ and $X_j(1:t)$. For the second case, suppose we don't observe the $X$ variables, but $B=X+\eta$, with extra observational noise $\eta_t$ independent across $i$, $j$ and $k$. Then, again, conditioning on the history of $B_j$ will add information about $X_k(t+1)$, after conditioning on the history of $B_i$ and even the history of $B_k$.
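The first construction in this footnote is easy to simulate. A sketch in Python, using AR(1) dynamics for $X_k$ rather than the moving average mentioned above (purely for brevity; all parameters are arbitrary): $X_j$'s history improves prediction of $X_i$, i.e. $j$ "Granger-causes" $i$, despite there being no causal link between them.

```python
import numpy as np

rng = np.random.default_rng(42)

# Latent AR(1) driver X_k; X_i and X_j are noisy copies of it,
# with no causal connection between i and j (both just read off k).
T = 5000
phi = 0.9
xk = np.zeros(T)
for t in range(1, T):
    xk[t] = phi * xk[t - 1] + rng.standard_normal()
xi = xk + rng.standard_normal(T)
xj = xk + rng.standard_normal(T)

def resid_var(y, predictors):
    """Least-squares residual variance of y on the given predictor columns."""
    Z = np.column_stack(predictors)
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.var(y - Z @ beta)

# Predict X_i(t+1) from X_i(t) alone, vs. from X_i(t) and X_j(t).
y = xi[1:]
own = resid_var(y, [xi[:-1], np.ones(T - 1)])
both = resid_var(y, [xi[:-1], xj[:-1], np.ones(T - 1)])

# X_j "Granger-causes" X_i: its history improves the prediction, purely
# because two noisy views of X_k beat one.
assert both < own
```

Adding more lags, or a noisy measurement of $X_k$ itself as a third predictor, gives the second case: $X_j$ still helps, hence a non-zero "conditional Granger causality".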
Lauren Beukes, The Shining Girls
A time-traveling psycho killer (a literal murder hobo) and his haunted house versus talented and energetic ("shining") women of Chicago throughout the 20th century. I cannot decide if this is just a creepy, mildly feminist horror novel with good characterization and writing, or if Beukes is trying to say something very dark about how men suppress female ability (and, if so, whether she's wrong about us).

Posted at May 31, 2015 23:59 | permanent link

## May 22, 2015

### 36-402, Advanced Data Analysis, Spring 2015: Self-Evaluation and Lessons Learned

Attention conservation notice: 2000+ words of academic navel-gazing about teaching a weird class in an obscure subject at an unrepresentative school; also, no doubt, more complacent than it ought to be.

Once again, it's the brief period between submitting all the grades for 402 and the university releasing the student evaluations (for whatever they're worth), so time to think about what I did, what worked, what didn't, and what to do better.

My self-evaluation was that the class went decently, but very far from perfectly, and needs improvement in important areas. I think the subject matter is good, the arrangement is at least OK, and the textbook a good value for the price. Most importantly, the vast majority of the students appear to have learned a lot about stuff they would not have picked up without the class. Since my goal is not for the students to have fun [0] but to challenge them to learn as much as possible, and assist them in doing so, I think the main objective was achieved, though not in ways which will make me beloved or even popular.

All that is much as it was in previous iterations of the class; the big changes from the last time I taught this were the assignments, using R Markdown, and the size of the class.

Writing (almost) all new assignments — ten homeworks and three exams — was good; it reduced cheating [1] to negligible proportions [2] and kept me interested in the material. It was also a lot more work, but I think it was worth it. Basing them on real papers, mostly but not exclusively from economics, seems to have gone over well, especially considering how many students were in the joint major in economics and statistics. (It also led to a gratifying number of students reporting crises of faith about what they were being taught in their classes in other departments.) Relatedly, having the technical content of each homework only add up to 90 points, with the remaining 10 being allocated for following a writing rubric [3], seems to have led to better writing, easier grading, and I think more perception of fairness in the grading.

Encouraging the use of R Markdown so that the students' data analyses were executable and replicable was a very good call. (I have to thank Jerzy Wieczorek for over-coming my skepticism by showing me R Markdown.) In fact, I think it worked well enough that in the future I will make it mandatory, with a teaching session at the beginning of the semester (and exceptions, with permission in advance, for those who want to use knitr and LaTeX). However, I may have to reconsider my use of the np package for kernel regression, since it is very aggressive about printing out progress messages which are not useful in a report.

The big challenge of the class was sheer size. The first time I taught this class, in 2011, it had 63 students; we hit 120 this year. (And the department expects about 50% more next year.) This, of course, made it impossible to get to know most of the students — at best I got a sense of the ones who were regular at my office hours or spoke up in lecture, and those who sent me e-mail frequently. (Linking the faces of the former to the names of the latter remains one of my weak points.) It also means I would have gone crazy if it weren't for the very good TAs (Dena Asta, Collin Eubanks, Sangwon "Justin" Hyun and Natalie Klein), and the assistance of Xizhen Cai, acting as my (as it were) understudy — but coordinating six people for teaching is also not one of my strengths. Over the four months of the semester I sent over a thousand e-mails about the class, roughly three quarters to students and a quarter among the six of us; I feel strongly that there have to be more efficient ways of doing this part of my job.

The "quality control" samples — select six students at random every week, have them in for fifteen minutes or so to talk about what they did on the last assignment and anything that leads to, with a promise that their answers will not hurt their grades — continue to be really informative. In particular, I made a point of asking every student how long they spent on that assignment and on previous ones, and most (though not all) were within the university's norms for a nine-credit class. Some students resisted participation, perhaps because they didn't trust the wouldn't-hurt-their-grades bit; if so, I failed at "drive out fear". Also, it needs a better name, since the students keep thinking it's their quality that's being controlled, rather than that of the teaching and grading.

Things that did not work so well:

• I did not do enough to ensure consistency across graders, especially when it came to the depth of the feedback to students. Unfortunately, the only things I can think of to improve this will use a lot of my time, but I'll just have to do them. Also, towards the end of the semester, we were slow in getting things graded, largely because of organizational failures on my part. Again, something I will just have to invest more time in doing right.
• A vocal and numerically non-trivial minority of the students keep finding it hard to get the connections between lectures, the text, and the assignments. If they were just the ones who were obviously slacking or not getting it, I'd be more inclined to dismiss them, but some otherwise very good ones were in this group. This means that the way I'm teaching is not working for people I ought to be able to reach, and I need to somehow change it — while still serving the ones my current style works for.
• This is a distinct complaint from those who dislike not having an explicit example in the text or lecture to copy for each homework assignment. This is not something I plan to change.
• This is also distinct from the complaint about having to do too much programming. That also isn't going to go away, and statistical computing isn't going to become a pre-requisite (whether or not that would be a good idea...), so there needs to be more provision of support for that.
• Many students continue to have difficulty with "what does the model say would happen under situation $X$?" questions, especially when situation $X$ does not occur in the training data. The handout on predict seems to have helped, but not gone far enough. This needs to be made even more explicit, and perhaps a whole lecture given over to hypotheticals, to predictive comparisons, and to average predictive comparisons.
• Once again, I dropped the lowest three homework grades, no questions asked, and didn't give extensions; this is so I don't have to try to decide whether a grand-parent is sick enough, or a job interview demanding enough, or extra-curricular activities at Carnival are important enough, to merit an extension. The drawback is that this leads to lots of students not doing some of the last problem sets (especially the ones who have done well before that), and so being thrown when the final is cumulative.

Things I am considering trying next time:

• Setting up a website where students can ask, and answer, questions with persistent pseudonyms [4], so they're not embarrassed to ask for help, and I don't have to repeat myself. (I actually looked into that this semester, but didn't find anything which didn't seem awful. Basically, what I want is a private Stack Exchange.) To prevent this degenerating into either a sewer or a forum for cheating, it will need moderation and monitoring, and perhaps need to be seeded with some planted questions, to encourage participation.
• I am not sure that setting take-home exams really accomplishes anything that wouldn't be done just as well by more homework assignments. Except that I can say they're not to collaborate on exams, and they (mostly, apparently) listen. I might, however, just make it one homework assignment a week, with three of them requiring the report format I've used for take-homes.
• Making participation in the quality-control sampling a small but non-zero part of the class grade, maybe 5% — full credit if you're either called up and do it, or never get called on, 0 if you refuse. (But maybe "a fine is a price" effects would then lead to less participation?)
• Include in each lecture (but not in the online notes?) a short question, to be answered by the next day, which is either conceptual or a tiny bit of theory, totaling say 5% of the grade. This should give me feedback on how well the lectures are working, and give them some feedback on how well they're actually understanding the ideas behind the methods. (The last thing I want to produce is people who just think they know which commands to type in R.)
• Consider moving the dependent-data lectures after graphical models but before causal inference, so as to end with the latter. I might also remove the new lecture on experimental design, because while it's a worthy subject it doesn't excite me, and it fits somewhat awkwardly with the others. (Perhaps I'm not reading the right stuff on experimental design.)
• Consider finding a replacement for the competition to find the most typos in the text.
• Consider promising to feed the class for the last lecture if the response rate on course evaluations goes above, say, 90%.

— Naturally, while proofing this before posting, the university e-mailed me the course evaluations. They were unsurprisingly bimodal.

[0] I have no objection to fun, or to fun classes, or even to students having fun in my classes; it's just not what I'm aiming at here. ^

[1] I am sorry to have to say that there are some students who have tried to cheat, by re-using old solutions. This is why I no longer put solutions on the public web, and part of why I made sure to write new assignments this time, or, if I did re-cycle, make substantial changes. ^

[2] At least, cheating that we caught. (I will not describe how we caught anyone.) ^

[3] This evolved a little over the semester; here's the final version.

The text is laid out cleanly, with clear divisions between problems and sub-problems. The writing itself is well-organized, free of grammatical and other mechanical errors, and easy to follow. Figures and tables are easy to read, with informative captions, axis labels and legends, and are placed near the text of the corresponding problems. All quantitative and mathematical claims are supported by appropriate derivations, included in the text, or calculations in code. Numerical results are reported to appropriate precision. Code is either properly integrated with a tool like R Markdown or knitr, or included as a separate R file. In the former case, both the knitted and the source file are included. In the latter case, the code is clearly divided into sections referring to particular problems. In either case, the code is indented, commented, and uses meaningful names. All code is relevant to the text; there are no dangling or useless commands. All parts of all problems are answered with actual coherent sentences, and never with raw computer code or its output. For full credit, all code runs, and the Markdown file knits (if applicable). ^

[4] The North American Mammals Paleofauna Database for homework 5 has about two thousand entries, so my thought would be to assign each student a random extinct species as their pseudonym. These should be socially neutral, and more memorable than numbers, but no doubt I'll discover that some students have profound feelings about the amphicyonidae. ^

Posted at May 22, 2015 19:34 | permanent link

## May 16, 2015

### Any P-Value Distinguishable from Zero is Insufficiently Informative


Attention conservation notice: 4900+ words, plus two (ugly) pictures and many equations, on a common mis-understanding in statistics. Veers wildly between baby stats. and advanced probability theory, without explaining either. Its efficacy at remedying the confusion it attacks has not been evaluated by a randomized controlled trial.

After ten years of teaching statistics, I feel pretty confident in saying that one of the hardest points to get through to undergrads is what "statistically significant" actually means. (The word doesn't help; "statistically detectable" or "statistically discernible" might've been better.) They have a persistent tendency to think that parameters which are significantly different from 0 matter, that ones which are insignificantly different from 0 don't matter, and that the smaller the p-value, the more important the parameter. Similarly, if one parameter is "significantly" larger than another, then they'll say the difference between them matters, but if not, not. If this was just about undergrads, I'd grumble over a beer with my colleagues and otherwise suck it up, but reading and refereeing for non-statistics journals shows me that many scientists in many fields are subject to exactly the same confusions as The Kids, and talking with friends in industry makes it plain that the same thing happens outside academia, even to "data scientists". (For example: an A/B test is just testing the difference in average response between condition A and condition B; this is a difference in parameters, usually a difference in means, and so it's subject to all the issues of hypothesis testing.) To be fair, one meets some statisticians who succumb to these confusions.

One reason for this, I think, is that we fail to teach well how, with enough data, any non-zero parameter or difference becomes statistically significant at arbitrarily small levels. The proverbial expression of this, due I believe to Andy Gelman, is that "the p-value is a measure of sample size". More exactly, a p-value generally runs together the size of the parameter, how well we can estimate the parameter, and the sample size. The p-value reflects how much information the data has about the parameter, and we can think of "information" as the product of sample size and precision (in the sense of inverse variance) of estimation, say $n/\sigma^2$. In some cases, this heuristic is actually exactly right, and what I just called "information" really is the Fisher information.

~~Rather than working on grant proposals~~ ~~Egged on by a friend~~ As a public service, I've written up some notes on this. Throughout, I'm assuming that we're testing the hypothesis that a parameter, or vector of parameters, $\theta$ is exactly zero, since that's overwhelmingly what people calculate p-values for — sometimes, I think, by a spinal reflex not involving the frontal lobes. Testing $\theta=\theta_0$ for any other fixed $\theta_0$ would work much the same way. Also, $\langle x, y \rangle$ will mean the inner product between the two vectors.

#### 1. Any Non-Zero Mean Will Become Arbitrarily Significant

Let's start with a very simple example. Suppose we're testing whether some mean parameter $\mu$ is equal to zero or not. Being straightforward folk, who follow the lessons we were taught in our one room log-cabin schoolhouse research methods class, we'll use the sample mean $\hat{\mu}$ as our estimator, and take as our test statistic $\frac{\hat{\mu}}{\hat{\sigma}/\sqrt{n}}$; that denominator is the standard error of the mean. If we're really into old-fashioned recipes, we'll calculate our p-value by comparing this to a table of the $t$ distribution with $n-1$ degrees of freedom, remembering that it's $n-1$ because we use up one degree of freedom on the mean estimate ($\hat{\mu}$), around which the standard deviation estimate ($\hat{\sigma}$) is computed. (If we're a bit more open to new-fangled notions, we bootstrap.) Now what happens as $n$ grows?

Well, we remember the central limit theorem: $\sqrt{n}(\hat{\mu} - \mu) \rightarrow \mathcal{N}(0,\sigma^2)$. With a little manipulation, and some abuse of notation, this becomes $\hat{\mu} \rightarrow \mu + \frac{\sigma}{\sqrt{n}}\mathcal{N}(0,1)$ The important point is that $\hat{\mu} = \mu + O(n^{-1/2})$. Similarly, albeit with more algebra, $\hat{\sigma} = \sigma + O(n^{-1/2})$. Now plug these in to our formula for the test statistic: $\begin{eqnarray*} \frac{\hat{\mu}}{\hat{\sigma}/\sqrt{n}} & = & \sqrt{n}\frac{\hat{\mu}}{\hat{\sigma}}\\ & = & \sqrt{n}\frac{\mu + O(n^{-1/2})}{\sigma + O(n^{-1/2})}\\ & = & \sqrt{n}\left(\frac{\mu}{\sigma} + O(n^{-1/2})\right)\\ & = & \sqrt{n}\frac{\mu}{\sigma} + O(1) \end{eqnarray*}$ So, as $n$ grows, the test statistic will go to either $+\infty$ or $-\infty$, at a rate of $\sqrt{n}$, unless $\mu=0$ exactly. If $\mu \neq 0$, then the test statistic eventually becomes arbitrarily large, while the distribution we use to calculate p-values stabilizes at a standard Gaussian distribution (since that's a $t$ distribution with infinitely many degrees of freedom). Hence the p-value will go to zero as $n\rightarrow \infty$, for any $\mu\neq 0$. The rate at which it does so depends on the true $\mu$, the true $\sigma$, and the number of samples. The p-value reflects how big the mean is ($\mu$), how precisely we can estimate it ($\sigma$), and our sample size ($n$).

 T-statistics calculated for five independent runs of Gaussian random variables with the specified parameters, plotted against sample size. Successive t-statistics along the same run are linked; the dashed lines are the asymptotic formulas, $\sqrt{n}\mu/\sigma$. Note that both axes are on a logarithmic scale. (Click on the image for a larger PDF version; source code.)
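The $\sqrt{n}$ growth in the figure is easy to check directly. A minimal sketch in Python (the figures here were made in R; this is just an independent illustration, with an arbitrary seed and arbitrary sample sizes): a fixed, small $\mu$, with only $n$ changing.

```python
import numpy as np
from math import sqrt, erfc

rng = np.random.default_rng(1)

# mu = 0.1 is "small" but fixed; only n changes below. The t-statistic
# tracks sqrt(n) * mu / sigma (up to O(1) noise), so the p-value
# collapses with n alone.
mu, sigma = 0.1, 1.0
results = {}
for n in [100, 10_000, 1_000_000]:
    x = rng.normal(mu, sigma, n)
    t = x.mean() / (x.std(ddof=1) / sqrt(n))
    p = erfc(abs(t) / sqrt(2))  # two-sided Gaussian-approximation p-value
    results[n] = (t, p)

# The three t-statistics are roughly sqrt(n) * 0.1, up to noise; by
# n = 10^6 the p-value has underflowed long before mu stopped being "small".
```

Nothing about $\mu$ changed between the three tests; only the information about it did.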

#### 2. Any Non-Zero Regression Coefficient Will Become Arbitrarily Significant

Matters are much the same if instead of estimating a mean we're estimating a difference in means, or regression coefficients, or linear combinations of regression coefficients ("contrasts"). The p-value we get runs together the size of the parameter, the precision with which we can estimate the parameter, and the sample size. Unless the parameter is exactly zero, as $n\rightarrow\infty$, the p-value will converge stochastically to zero.

Even if two parameters are estimated from the same number of samples, the one with a smaller p-value is not necessarily larger; it may just have been estimated more precisely. Let's suppose we're in the land of good, old-fashioned linear regression, where $Y = \langle X, \beta \rangle + \epsilon$, all the random variables have mean 0 (to simplify book-keeping), and $\epsilon$ is uncorrelated with $X$. Estimating $\beta$ with ordinary least squares, we get of course $\hat{\beta} = (\mathbf{x}^T \mathbf{x})^{-1} \mathbf{x}^T \mathbf{y} ~,$ with $\mathbf{x}$ being the $n\times p$ matrix of $X$ values and $\mathbf{y}$ the $n\times 1$ matrix of $Y$ values. Since $\mathbf{y} = \mathbf{x} \beta + \mathbf{\epsilon}$, $\hat{\beta} = \beta + (\mathbf{x}^T \mathbf{x})^{-1}\mathbf{x}^T \mathbf{\epsilon} ~.$ Assuming the $\epsilon$ terms are uncorrelated with each other and have constant variance $\sigma^2_{\epsilon}$, we get $\Var{\hat{\beta}} = \sigma^2_{\epsilon} (\mathbf{x}^T \mathbf{x})^{-1} ~.$ To understand what's really going on here, notice that $\frac{1}{n} \mathbf{x}^T \mathbf{x}$ is the sample variance-covariance matrix of $X$; call it $\hat{\mathbf{v}}$. (I give it a hat because it's an estimate of the population covariance matrix.) So $\Var{\hat{\beta}} = \frac{\sigma^2_{\epsilon}}{n}\hat{\mathbf{v}}^{-1}$ The standard errors for the different components of $\hat{\beta}$ are thus going to be the square roots of the diagonal entries of $\Var{\hat{\beta}}$. We will therefore estimate different regression coefficients to different precisions. To make a regression coefficient precise, the predictor variable it belongs to should have a lot of variance, and it should have little correlation with other predictor variables. (If we use an orthogonal design, $\hat{\mathbf{v}}^{-1/2}$ will be a diagonal matrix whose entries are the reciprocals of the regressors' standard deviations.) 
Even if we think that the size of entries in $\beta$ is telling us something about how important different $X$ variables are, one of them having a bigger variance than the other doesn't make it more important in any interesting sense.
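A toy simulation makes the point concrete. A sketch in Python (all numbers made up for illustration): two coefficients of identical size, but the one attached to the higher-variance predictor gets a far larger t-statistic, and hence a far smaller p-value.

```python
import numpy as np

rng = np.random.default_rng(7)

# Two predictors with the SAME true coefficient (0.2), but very
# different variances. The high-variance predictor's coefficient is
# estimated far more precisely, hence gets the smaller p-value.
n = 500
x1 = rng.normal(0, 10.0, n)   # high-variance predictor
x2 = rng.normal(0, 0.5, n)    # low-variance predictor
X = np.column_stack([x1, x2])
beta = np.array([0.2, 0.2])
y = X @ beta + rng.standard_normal(n)

# OLS: beta_hat = (x'x)^{-1} x'y,  Var(beta_hat) = s^2 (x'x)^{-1}
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (n - 2)
se = np.sqrt(s2 * np.diag(XtX_inv))

t_stats = beta_hat / se
# Same true coefficient, but the t-statistic for x1 dwarfs that for x2:
# the p-value is measuring precision as much as size.
assert abs(t_stats[0]) > abs(t_stats[1])
```

Exactly as in the formulas above: the standard errors scale like $\sigma_{\epsilon}/(\mathrm{sd}(X_j)\sqrt{n})$, so the p-values differ by orders of magnitude for equally "important" coefficients.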

#### 3. Consistent Hypothesis Tests Imply Everything Will Become Arbitrarily Significant

So far, I've talked about particular cases --- about estimating means or linear regression coefficients, and even using particular estimators. But the point can be made much more generally, though at some cost in abstraction. Recall that a hypothesis test can make two kinds of error: it can declare that there's some signal when it really looks at noise (a "false alarm" or "type I" error), or it can ignore the presence of a signal and mistake it for noise (a "miss" or "type II" error). The probability of a false alarm, when looking at noise, is called the size of a test. The probability of noticing a signal when it is present is called the power to detect the signal. A hypothesis test is consistent if its size goes to 0 and its power goes to 1 as the number of data points grows. (Purists would call this a consistent sequence of hypothesis tests, but I'm trying to speak like a human being.)

Suppose that a consistent hypothesis test exists. Then at each sample size $n$, there's a range of p-values $[0,a_n]$ where we reject the noise hypothesis and claim there's a signal, and another $(a_n,1]$ where we say there's noise. Since the p-value is uniformly distributed under the noise hypothesis, the size of the test is just $a_n$, so consistency means $a_n$ must go to 0. The power of the test is the probability, in the presence of signal, that the p-value is in the rejection region, i.e., $\Probwrt{\mathrm{signal}}{P \leq a_n}$. Since, by consistency, the power is going to 1, the probability (in the presence of signal) that the p-value is less than any given value eventually goes to 1. Hence the p-value converges stochastically to 0 (again, when there's a signal). Thus, if there is a consistent hypothesis test, and there is any signal to be detected at all, the p-value must shrink towards 0.

I bring this up because, of course, the situations where people usually want to calculate p-values are in fact the ones where there usually are consistent hypothesis tests. These are situations where we have an estimator $\hat{\theta}$ of the parameter $\theta$ which is itself "consistent", i.e., $\hat{\theta} \rightarrow \theta$ in probability as $n \rightarrow \infty$. This means that with enough data, the estimate $\hat{\theta}$ will come arbitrarily close to the truth, with as much probability as we might desire. It's not hard to believe that this will mean there's a consistent hypothesis test --- just reject the null when $\hat{\theta}$ is too far from 0 --- but the next two paragraphs sketch a proof, for the sake of skeptics and quibblers.

Consistency of estimation means that for any level of approximation $\epsilon > 0$ and any level of confidence $\delta > 0$, for all $n \geq$ some $N(\epsilon,\delta,\theta)$, $\Probwrt{\theta}{\left|\hat{\theta}_n-\theta\right|>\epsilon} \leq \delta ~.$ This can be inverted: for any $n$ and any $\delta$, for any $\eta \geq \epsilon(n,\delta,\theta)$, $\Probwrt{\theta}{\left|\hat{\theta}_n-\theta\right|>\eta} \leq \Probwrt{\theta}{\left|\hat{\theta}_n-\theta\right|>\epsilon(n,\delta,\theta)} \leq \delta ~.$ Moreover, as $n\rightarrow\infty$ with $\delta$ and $\theta$ held constant, $\epsilon(n,\delta,\theta) \rightarrow 0$.

Pick any $\theta^* \neq 0$, and any $\alpha$ and $\beta > 0$ that you like. For each $n$, set $\epsilon = \epsilon(n,\alpha,0)$; abbreviate this sequence as $\epsilon_n$. I will use $\hat{\theta}_n$ as my test statistic, retaining the null hypothesis $\theta=0$ when $\left|\hat{\theta}_n\right| \leq \epsilon_n$, and rejecting it otherwise. By construction, my false alarm rate is at most $\alpha$. What's my miss rate? Well, again by consistency of the estimator, for any sufficiently small but fixed $\eta > 0$, if $n \geq N(|\theta^*| - \eta, \beta, \theta^*)$, then $\Probwrt{\theta^*}{\left|\hat{\theta}_n\right| < \eta} \leq \Probwrt{\theta^*}{\left|\hat{\theta}_n - \theta^*\right|\geq |\theta^*| - \eta} \leq \beta ~.$ (To be very close to 0, $\hat{\theta}$ has to be far from $\theta^*$.) So, if I wait until $n$ is large enough that $n \geq N(|\theta^*| - \eta, \beta, \theta^*)$ and that $\epsilon_n \leq \eta$, my power against $\theta=\theta^*$ is at least $1-\beta$ (and my false-positive rate is still at most $\alpha$). Since you got to pick $\alpha$ and $\beta$ arbitrarily, you can make them as close to 0 as you like, and I can still get arbitrarily high power against any alternative while still controlling the false-positive rate. In fact, you can pick a sequence of error rate pairs $(\alpha_k, \beta_k)$, with both rates going to zero, and for $n$ sufficiently large, I will, eventually, have a size less than $\alpha_k$, and a power against $\theta=\theta^*$ greater than $1-\beta_k$. Hence, a consistent estimator implies the existence of a consistent hypothesis test. (Pedantically, we have built a universally consistent test, i.e., consistent whatever the true value of $\theta$ might be, but not necessarily a uniformly consistent one, where the error rates can be bounded independent of the true $\theta$. 
The real difficulty there is that there are parameter values in the alternative hypothesis $\theta \neq 0$ which come arbitrarily close to the null hypothesis $\theta=0$, and so an arbitrarily large amount of information may be needed to separate them with the desired reliability.)

#### 4. $p$-Values for Means Should Shrink Exponentially Fast

So far, I've been arguing that the p-value should always go stochastically to zero as the sample size grows. In many situations, it's possible to be a bit more precise about how quickly it goes to zero. Again, start with the simple case of testing whether a mean is equal to zero. We saw that our test statistic $\hat{\mu}/(\hat{\sigma}/\sqrt{n}) \rightarrow \sqrt{n}\mu/\sigma + O(1)$, and that the distribution we compare this to approaches $\mathcal{N}(0,1)$. Since for a standard Gaussian $Z$ the probability that $Z > t$ is at most $\frac{\exp{\left\{-t^2/2\right\}}}{t\sqrt{2\pi}}$, the p-value in a two-sided test goes to zero exponentially fast in $n$, with the asymptotic exponential rate being $\frac{1}{2}\mu^2/\sigma^2$. Let's abbreviate the p-value after $n$ samples as $P_n$: $\begin{eqnarray*} P_n & = & \Prob{|Z| \geq \left|\frac{\hat{\mu}}{\hat{\sigma}/\sqrt{n}}\right|}\\ & = & 2 \Prob{Z \geq \left|\frac{\hat{\mu}}{\hat{\sigma}/\sqrt{n}}\right|}\\ & \leq & 2\frac{\exp{\left\{-n\hat{\mu}^2/2\hat{\sigma}^2\right\}}}{\sqrt{n}\hat{\mu}\sqrt{2\pi}/\hat{\sigma}}\\ \frac{1}{n}\log{P_n} & \leq & \frac{\log{2}}{n} -\frac{\hat{\mu}^2}{2\hat{\sigma}^2} - \frac{\log{n}}{2n} - \frac{1}{n}\log{\frac{\hat{\mu}}{\hat{\sigma}}} - \frac{\log{2\pi}}{2n}\\ \lim_{n\rightarrow\infty}{\frac{1}{n}\log{P_n}} & \leq & -\frac{\mu^2}{2\sigma^2} \end{eqnarray*}$ Since $\Prob{Z > t}$ is also at least $\exp{\left\{-t^2/2\right\}}/(t^2+1)\sqrt{2\pi}$, a parallel argument gives a matching lower bound, $\lim_{n\rightarrow\infty}{n^{-1}\log{P_n}} \geq -\frac{1}{2}\mu^2/\sigma^2$.

 P-value versus sample size, color coded as in the previous figure. Notice that even the runs where $\mu$, and $\mu/\sigma$, are very small (in green), the p-value is declining exponentially. Again, click for a larger PDF, source code here.
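The claimed rate is also easy to check numerically. A sketch in Python (seed, $\mu$, $\sigma$, and $n$ all arbitrary), using the upper tail bound from the derivation rather than a numerical tail probability, since the latter underflows for large $n$:

```python
import numpy as np
from math import sqrt, log

rng = np.random.default_rng(3)

# Check the rate: (1/n) log P_n should approach -mu^2 / (2 sigma^2).
mu, sigma = 0.5, 1.0
target = -mu**2 / (2 * sigma**2)   # = -0.125 with these (made-up) values

n = 2000
x = rng.normal(mu, sigma, n)
t = x.mean() / (x.std(ddof=1) / sqrt(n))
# Log of the two-sided p-value, via the Gaussian upper tail bound
# P_n <= 2 exp(-t^2/2) / (t sqrt(2 pi)) used in the text.
log_p_upper = log(2) - t * t / 2 - log(t * sqrt(2 * np.pi))
rate = log_p_upper / n
# rate should sit near -mu^2 / (2 sigma^2), up to O(1/sqrt(n)) noise.
```

At $n=2000$ the p-value itself is on the order of $e^{-250}$, far below anything a table (or most software) will report, while the rate is already close to its limit.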

#### 5. $p$-Values in General Will Often Shrink Exponentially Fast

This is not just a cute trick with Gaussian approximations; it generalizes through the magic of large deviations theory. Glossing over some technicalities, a sequence of random variables $X_1, X_2, \ldots X_n$ obey a large deviations principle when $\lim_{n\rightarrow\infty}{\frac{1}{n}\log{\Probwrt{}{X_n \in B}}} = -\inf_{x\in B}{D(x)}$ where $D(x) \geq 0$ is the "rate function". If the set $B$ doesn't include a point where $D(x)=0$, the probability of $B$ goes to zero, exponentially in $n$, with the exact rate depending on the smallest attainable value of the rate function $D$ over $B$. ("Improbable events tend to happen in the most probable way possible.") Very roughly speaking, then, $\Probwrt{}{X_n \in B} \approx \exp{\left\{ - n \inf_{x\in B}{D(x)}\right\}}$. Suppose that $X_n$ is really some estimator of the parameter $\theta$, and it obeys a large deviations principle for every $\theta$. Then the rate function $D$ is really $D_{\theta}$. For consistent estimators, $D_{\theta}(x)$ would have a unique minimum at $x=\theta$. The usual estimators based on sample means, correlations, sample distributions, maximum likelihood, etc., all obey large deviations principles, at least under most of the conditions where we'd want to apply them.

Suppose we make a test based on this estimator. Under $\theta=\theta^*$, $X_n$ will eventually be within any arbitrarily small open ball $B_{\rho}$ of size $\rho$ around $\theta^*$ we care to name; the probability of its lying outside $B_{\rho}$ will be going to zero exponentially fast, with the rate being $\inf_{x\in B^c_{\rho}}{D_{\theta^*}(x)} > 0$. For small $\rho$ and smooth $D_{\theta^*}$, Taylor-expanding $D_{\theta^*}$ about its minimum suggests that rate will be $\inf_{\eta: \|\eta\| > \rho}{\frac{1}{2}\langle \eta, J_{\theta^*} \eta\rangle}$, $J_{\theta^*}$ being the matrix of $D$'s second derivatives at $\theta^*$. This, clearly, is $O(\rho^2)$.

The probability under $\theta = 0$ of seeing results $X_n$ lying inside $B_{\rho}$ is very different. If we've made $\rho$ small enough that $B_{\rho}$ doesn't include 0, $\Probwrt{0}{X_n \in B_{\rho}} \rightarrow 0$ exponentially fast, with rate $\inf_{x \in B_{\rho}}{D_0(x)}$. Again, if $\rho$ is small enough and $D_0$ is smooth enough, the value of the rate function should be essentially $D_0(\theta^*) + O(\rho^2)$. If $\theta^*$ in turn is close enough to 0 for a Taylor expansion, we'd get a rate of $\frac{1}{2}\langle \theta^*, J_0 \theta^*\rangle$. To repeat, this is the exponential rate at which the p-value is going to zero when we test $\theta=0$ vs. $\theta\neq 0$, and the alternative value $\theta^*$ is true. It is no accident that this is the same sort of rate we got for the simple Gaussian-mean problem.

Relating the matrix I'm calling $J$ to the Fisher information matrix $F$ needs a longer argument, which I'll present even more sketchily. The empirical distribution obeys a large deviations principle whose rate function is the Kullback-Leibler divergence, a.k.a. the relative entropy; this result is called "Sanov's theorem". For small perturbations of the parameter $\theta$, the divergence between a distribution at $\theta+\eta$ and that at $\theta$ is, after yet another Taylor expansion and a little algebra, $\frac{1}{2}\langle \eta, F_{\theta} \eta \rangle$. A general result in large deviations theory, the "contraction principle", says that if the $X_n$ obey an LDP with rate function $D$, then $Y_n = h(X_n)$ obeys an LDP with rate function $D^{\prime}(y) = \inf_{x : h(x) = y}{D(x)}$. Thus an estimator which is a function of the empirical distribution, which is most of them, will have a decay rate which is at most $\frac{1}{2}\langle \eta, F_{\theta} \eta \rangle$, and possibly less, if the estimator is crude enough. (The maximum likelihood estimator in an exponential family will, however, preserve large deviation rates, because it's a sufficient statistic.)
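For the record, the "yet another Taylor expansion and a little algebra" runs as follows — a sketch, under the usual regularity conditions (smooth densities, derivatives and expectations interchangeable):

```latex
\begin{aligned}
D(\theta \,\|\, \theta+\eta)
 &= \mathbb{E}_{\theta}\!\left[\log p_{\theta}(X) - \log p_{\theta+\eta}(X)\right]\\
 &\approx -\,\mathbb{E}_{\theta}\!\left[\langle \eta, \nabla_{\theta}\log p_{\theta}(X) \rangle\right]
   \;-\; \frac{1}{2}\,\mathbb{E}_{\theta}\!\left[\langle \eta,
     \left(\nabla^2_{\theta}\log p_{\theta}(X)\right) \eta \rangle\right]\\
 &= 0 \;+\; \frac{1}{2}\langle \eta, F_{\theta}\, \eta \rangle
\end{aligned}
% The first-order term vanishes because the score has mean zero under theta;
% minus the expected Hessian of the log-likelihood is the Fisher information.
```

The divergence in the other direction, $D(\theta+\eta \,\|\, \theta)$, agrees to second order, so the asymmetry of the KL divergence doesn't matter at this level of approximation.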

#### 6. What's the Use of $p$-Values Then?

Much more limited than the bad old sort of research methods class (or third referee) would have you believe. If you find a small p-value, yay; you've got enough data, with precise enough measurement, to detect the effect you're looking for, or you're really unlucky. If your p-value is large, you're either really unlucky, or you don't have enough information (too few samples or too little precision), or the parameter is really close to zero. Getting a big p-value is not, by itself, very informative; even getting a small p-value has uncomfortable ambiguity. My advice would be to always supplement a p-value with a confidence set, which would help you tell apart "I can measure this parameter very precisely, and if it's not exactly 0 then it's at least very small" from "I have no idea what this parameter might be". Even if you've found a small p-value, I'd recommend looking at the confidence interval, since there's a difference between "this parameter is tiny, but really unlikely to be zero" and "I have no idea what this parameter might be, but can just barely rule out zero", and so on and so forth. Whether there are any scientific inferences you can draw from the p-value which you couldn't just as easily draw from the confidence set, I leave between you and your referees. What you definitely should not do is use the p-value as any kind of proxy for how important a parameter is.

If you want to know how much some variable matters for predictions of another variable, you are much better off just perturbing the first variable, plugging it into your model, and seeing how much the outcome changes. If you need a formal version of this, and don't have any particular size or distribution of perturbations in mind, then I strongly suggest using Gelman and Pardoe's "average predictive comparisons". If you want to know how much manipulating one variable will change another, then you're dealing with causal inference, but once you have a tolerable causal model, again you look at what happens when you perturb it. If what you really want to know is which variables you should include in your predictive model, the answer is the ones which actually help you predict, and this is why we have cross-validation (and have had it for as long as I've been alive), and, for the really cautious, completely separate validation sets. To get a sense of just how misleading p-values can be as a guide to which variables actually carry predictive information, I can hardly do better than Ward et al.'s "The Perils of Policy by p-Value", so I won't.
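The "perturb and see" idea can be made concrete in a few lines. This is my own crude sketch of the underlying notion, not Gelman and Pardoe's actual estimator: fit a model, shift each input by a typical amount (here, one standard deviation), and look at the average change in the prediction.

```python
# Crude "perturb and see" sketch (not Gelman & Pardoe's exact average
# predictive comparison): shift one input by a standard deviation and
# measure the mean change in the fitted model's predictions.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + 0.1 * x2 + rng.normal(size=n)   # x1 matters, x2 barely does

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares fit

def predict(a, b):
    return beta[0] + beta[1] * a + beta[2] * b

# average change in prediction when each variable moves by one sd
delta1 = np.mean(predict(x1 + x1.std(), x2) - predict(x1, x2))
delta2 = np.mean(predict(x1, x2 + x2.std()) - predict(x1, x2))
print(delta1, delta2)   # roughly 2 vs. roughly 0.1: x1 dominates predictively
```

With enough data, both coefficients would get tiny p-values; the perturbation comparison, unlike the p-values, actually tells you which variable matters for prediction.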

(I actually have a lot more use for p-values when doing goodness-of-fit testing, rather than as part of parametric estimation, though even there one has to carefully examine how the model fails to fit. But that's another story for another time.)

Nearly fifty years ago, R. R. Bahadur defined the efficiency of a test as the "rate at which it makes the null hypothesis more and more incredible as the sample size increases when a non-null distribution obtains", and gave a version of the large deviations argument to say that these rates should typically be exponential. The reason he could do so was that it was clear the p-value will always go to zero as we get more information, and so the issue is whether we're using that information effectively. In another fifty years, I presume that students will still have difficulties grasping this, but I piously hope that professionals will have absorbed the point.

References:

*: For the sake of completeness, I should add that sometimes we need to replace the $1/n$ scaling by $1/r(n)$ for some increasing function $r$, e.g., for dense graphs where $n$ counts the number of nodes, $r(n)$ would typically be $O(n^2)$. ^

(Thanks to KLK for discussions, and feedback on a draft.)

Update, 17 May 2015: Fixed typos (backwards inequality sign, errant $\theta$ for $\rho$) in large deviations section.

Posted at May 16, 2015 12:39 | permanent link

## May 12, 2015

### "The free development of each is the condition of the war of all against all": Some Paths to the True Knowledge

Attention conservation notice: A 5000+ word attempt to provide real ancestors and support for an imaginary ideology I don't actually accept, drawing on fields in which I am in no way an expert. Contains long quotations from even-longer-dead writers, reckless extrapolation from arcane scientific theories, and an unwarranted tone of patiently explaining harsh, basic truths. Altogether, academic in one of the worst senses. Also, spoilers for several of MacLeod's novels, notably but not just The Cassini Division. Written for, and cross-posted to, Crooked Timber's seminar on MacLeod, where I will not be reading the comments.

I'll let Ellen May Ngwethu, late of the Cassini Division, open things up:

What I want to consider here is how people who aren't inmates of a privatized gulag could come to the true knowledge, or something very like it; how they might use it; and some of how MacLeod makes it come alive.

## Their Morals and Ours

One route, of course, would be through the Marxist and especially the Trotskyist tradition; I suspect this was MacLeod's. In "Their Morals and Ours", Trotsky laid out a famous formulation of what really matters:

A means can be justified only by its end. But the end in its turn needs to be justified. From the Marxist point of view, which expresses the historical interests of the proletariat, the end is justified if it leads to increasing the power of man over nature and to the abolition of the power of man over man.

Other2 moral ideas are really expressions of self- or, especially, class- interest, indeed tools in the class struggle:

Morality is one of the ideological functions in this struggle. The ruling class forces its ends upon society and habituates it into considering all those means which contradict its ends as immoral. That is the chief function of official morality. It pursues the idea of the "greatest possible happiness" not for the majority but for a small and ever diminishing minority. Such a regime could not have endured for even a week through force alone. It needs the cement of morality. The mixing of this cement constitutes the profession of the petty-bourgeois theoreticians, and moralists. They dabble in all colors of the rainbow but in the final instance remain apostles of slavery and submission.

But if you really want to know whether something is good or bad, Trotsky says, you ask whether it really conduces to "the liberation of mankind", to "to increasing the power of man over nature and to the abolition of the power of man over man". Intentions don't matter, nor do formal similarities; what matters is whether means and acts really help advance this over-riding end. Thus, explicitly, even terrorism can be justified under conditions where it will be effective (as when Trotsky practiced it during the Civil War).

Trotsky did not, of course, have occasion to contemplate eliminating an extra-terrestrial civilization, but I think his position would have been clear.

## The Historic Route

The good-means-good-for-me, might-is-right theme is also one with a long history in western philosophy, often as the dreadful fate from which philosophy will save us, but sometimes as the liberating truth which philosophy reveals. This means that something like the true knowledge could, paradoxically enough, be developed out of the classical western tradition.

The obvious way to do this would be to start from figures like Nietzsche who have said pretty similar things. Most of these 19th and 20th century figures would of course have looked on the Solar Union with utter horror, but even so there is, I think, a way there. Many of these philosophers simultaneously celebrate power and bemoan the way in which the great and powerful are dragged down or confined by the weak. This creates a tension, if not an outright contradiction. Who is really more powerful? Clearly, if the mediocre masses can collectively dominate and overwhelm the individually magnificent few, the masses have more power. As Hume said, albeit in a somewhat different context, "force is always on the side of the governed". (Or again: "Such a regime could not have endured for even a week through force alone".) Someone who was willing to combine Nietzsche's celebration of power with a frank assessment of both their own power as an isolated individual and of the potential power of different groups could well end up at the true knowledge.

Even less work would be to go further back into the past, to the great figures of the 17th century, like Hobbes and, most especially, Spinoza. Here we find thinkers willing to found, if not socialism, then at least social and political life on "pessimistic and cynical conclusions about human nature". The latter's Political Treatise is quite explicit about the pessimism and the cynicism:

[M]en are of necessity liable to passions, and so constituted as to pity those who are ill, and envy those who are well off; and to be prone to vengeance more than to mercy: and moreover, that every individual wishes the rest to live after his own mind, and to approve what he approves, and reject what he rejects. And so it comes to pass, that, as all are equally eager to be first, they fall to strife, and do their utmost mutually to oppress one another; and he who comes out conqueror is more proud of the harm he has done to the other, than of the good he has done to himself. [Elwes edition, I.5]

Spinoza is equally clear that one's rights extend exactly as far as one's power3, and that the reason people band together is to increase their power4. It is precisely on this basis that Spinoza came to advocate democracy, as uniting more of the power of the people in the commonwealth, especially their powers of reasoning. Of course Spinoza's political views were not the true knowledge, but he actually provides a surprisingly close starting point, and reasoning from his premises and the stand-point of someone who knows they are not going to be at the top of the heap unless they level it all would get you most of the rest of the way there. This would include Spinoza's idea that obedience, allegiance, even solidarity are all dissolved when they are no longer advantageous.

I want to mention one more pseudo-ancestor for the true knowledge. I said before that the themes that might is right, and "good" means "good for me", are ancient ones in the history of philosophy, but they were introduced as the awful dangers which ethics is supposed to save us from. All the way back in The Republic, we find clear statements of the idea that might is right, that the alternative to pursuing self-interest is sheer stupidity, and that cooperation emerges from alignment of interests. We are supposed to recoil from these ideas in horror, but they can only arouse horror if it seems like there's something to them5. The danger with this tactic is that the initial presentation of the amoralist ideas may end up seeming more convincing than their later refutation. (I think that's the case even in The Republic.) And then one is reduced to talking about how refusing to accept that some transcendental, unverifiable ideas are true will lead to bad-for-you consequences in this world, and the game is over.

## Evolutionary Game Theory as the True Knowledge

No doubt some scholars in the Solar Union will, as I have done above, play the game of trying to find retrospective anticipations of some idea in the words of people who were really saying something else. On the other hand, at some point the true knowledge leaves its bonded-labor camps, joins up with the Sino-Soviet army, and starts expanding "from Vladivostok to Lisbon, from sea to shining sea". As it moves into the wider world, it encounters scientific knowledge considerably more up to date than Darwin and Engels. Does this set the stage for another shameful and self-defeating episode of an ideology trying desperately to hold on to a bit of fossilized science?

I actually don't see why it should. There are scientific theories nowadays which try to address the sort of questions that the true knowledge claims to answer, and I don't think the answers are really that different, though they are not usually presented so starkly.

Biologically, life is a process of assimilating matter and energy, of appropriating parts of the world to sustain itself. Nothing with a stomach is innocent of preying on other living things, and even plants survive, grow, and reproduce only by consuming their environment and re-shaping it to their convenience. The organisms which are better at appropriating and changing the world to suit themselves will live and expand at the expense of those which are worse at it. Those organisms whose acts serve their own good will do better for themselves than those which don't — whether or not that might in some extra-mundane sense be right or just. Abstract goods keep nothing alive, help nothing to grow; self-seeking is what will persist, and everything else will perish. And then when we throw these creatures together, they will inevitably compete, they will rival and oppose. Of course they can aid each other, but this aid will take the form of more effective exploitation of resources, including other life.

There is now a whole sub-field of biology devoted precisely to understanding when organisms will cooperate and assist each other, namely evolutionary game theory. It teaches us conditions for the selection of forms of reciprocity and even of solidarity, even among organisms without shared genetic interests. But those are, precisely, conditions under which the reciprocity and solidarity advance self-interest; it's cooperation in the service of selfishness.

Take the paradigm of the prisoners' dilemma, but tell it a bit differently. Alice and Babur are two bandits, who can either cooperate with each other in robbing villages and caravans, or defect by turning on each other. If they both cooperate, each will take \$1,000; if they both defect, neither can steal effectively and they'll get \$0. If Alice cooperates and Babur defects by turning on her, he will get \$2,000 and she will lose \$500, and vice versa. This has exactly the structure of the usual presentations of the dilemma, but makes it plain that "cooperation" is cooperation between Alice and Babur, and can perfectly well be cooperation in preying upon others. It's a famous finding of evolutionary game theory that a strategy of conditional cooperation, of Alice cooperating with Babur until he stops cooperating with her and vice versa, is better for those players than the treacherous, uncooperative one of their turning on each other, and that a population of conditional cooperators will resist invasion by non-cooperators6. Such strategies of cooperation in exploiting others are what the field calls "pro-social behavior"[^nbandits].
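For anyone who wants to see the famous finding in motion, here is a minimal sketch of repeated play with exactly the bandits' payoffs above. The payoff numbers come from the example in the text; the strategies and round count are my own illustrative choices.

```python
# Iterated prisoners' dilemma with the bandits' payoffs from the text:
# mutual cooperation pays 1000 each, mutual defection 0, and a lone
# defector takes 2000 while the betrayed cooperator loses 500.
PAYOFF = {('C', 'C'): (1000, 1000), ('D', 'D'): (0, 0),
          ('C', 'D'): (-500, 2000), ('D', 'C'): (2000, -500)}

def play(strat_a, strat_b, rounds=100):
    """Total scores for two strategies, each seeing the other's last move."""
    score_a = score_b = 0
    last_a = last_b = 'C'   # conditional cooperators open by cooperating
    for _ in range(rounds):
        a, b = strat_a(last_b), strat_b(last_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        last_a, last_b = a, b
    return score_a, score_b

tit_for_tat = lambda opponents_last: opponents_last  # copy their last move
always_defect = lambda opponents_last: 'D'

print(play(tit_for_tat, tit_for_tat))      # (100000, 100000): banditry pays
print(play(always_defect, always_defect))  # (0, 0)
print(play(tit_for_tat, always_defect))    # (-500, 2000): betrayed once, then retaliates
```

The point of the exercise is the one in the text: the "cooperation" that wins here is cooperation between the two bandits, at the villagers' expense.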

Since evolutionary game theorists are for the most part well-adjusted members of bourgeois society, neither psychopaths nor revolutionaries, they do not usually frame their conclusions with the starkness which their own theories would really justify; in this respect, there has been a decline since the glory days when von Neumann could pronounce that "It is just as foolish to complain that people are selfish and treacherous as it is to complain that the magnetic field does not increase unless the electric field has a curl." If we could revive some of that von Neumann spirit, a fair synthesis of works like The Evolution of Cooperation, The Calculus of Selfishness, A Cooperative Species, Individual Strategy and Social Structure, etc., would go something like this: "Cooperation evolves just to the extent that it both advances the self-interests of the cooperators, and each of them has enough power to make the other hurt if betrayed. Everything else is self-defeated, is 'dominated'. Typically, the gains from cooperation arise from more effectively exploiting others. Also, inside every positive-sum story about gains from cooperation, there is a negative-sum struggle over dividing those gains, a struggle where the advantage lies with the already-stronger party." A somewhat more speculative addendum would be the following: "We have evolved to like hurting those who have wronged us, or who have flouted rules we want them to follow, because our ancestors have had to rely for so many millions of years on selfish, treacherous fellow creatures, and 'pro-social punishment' is how we've kept each other in line enough to take over the world."

There is little need to elaborate on how neatly this dovetails with the true knowledge, so I won't7. This alignment is, I suspect, no coincidence.

Given these points, how do we think about choices between who to cooperate with, or even whether to cooperate at all? Look for those whose interests are aligned with yours, and where cooperation will do the most to advance your interests — to those with the most power, most closely aligned with you. To neglect to ally oneself when it would be helpful is not wicked — what has wickedness to do with any of this? — but it is stupid, because it leads to needless weakness.

At this point, or somewhere near it, the Sheenisov must have made a leap which seems plausible but not absolutely compelling. The united working class is more powerful than the other forces in capitalism, the last of the "tool-making cultures of the Upper Pleistocene". To throw in with that is to get with the strength. Why solidarity? Because it's the source of power. At the same time, it's a source of strength which can hardly tolerate other, rival powers — organized non-cooperators, capitalist and statist remnants, since they threaten it, and it them.

These arguments would apply to any sort of organism — including Jovian post-humans as well as us, and so Ellen May seems to me to have very much the worse of her argument with Mary-Lou Radiation Nation Smith:

"They're not monsters, you know. Why should you expect beings more powerful and intelligent than ourselves to be worse than ourselves? Wouldn't it be more reasonable to expect them to be better? Why should more power mean less good?" I could hardly believe I was hearing this. ... I searched for my most basic understanding, and dragged it out: "Because good means good for us!" Mary-Lou smiled encouragingly and spoke gently, as though talking someone down from a high ledge. "Yes, Ellen. But who is us? We're all — human, post-human, non-human — machines with minds in a mindless universe, and it behoves those of us with minds to work together if we can in the face of that mindless universe. It's the possibility of working together that forges an us, and only its impossibility that forces a them. That is the true knowledge as a whole — the union, and the division."8

(The worse of the argument, that is, unless Ellen May can destroy the fast folk, in which case there is no power to either unite with or to fear. "No Jovian superintelligences, no problem", as it were.)

## But What If It Should Come to Be Generally Known?

As I said earlier, contemporary scientists studying the evolution of cooperation do not usually put their conclusions in such frank terms as the true knowledge. I don't even think that this is because they're reluctant to do so; I think it genuinely doesn't occur to them. (And this despite things like one of the founders of evolutionary game theory, John Maynard Smith, being an outright Marxist and ex-Communist.) Even when people like Bowles and Gintis — not Marxists, but no strangers to the leftist tradition — try to draw lessons from their work, they end up with very moderate social democracy, not the true knowledge. Since I know Bowles and Gintis, I am pretty sure that they are not holding back...

Why so few people are willing to push these ideas to (one) logical conclusion is an interesting question I cannot pretend to answer. I suspect that part of the answer has to do with people not having grown up with these ideas, so that the theories are used more to reconstruct pre-existing notions than as guides in their own right. If that's so, then a few more (academic) generations of their articulation, especially if some of the articulators should happen to have the right bullet-swallowing tendencies, could get us all the way to the true knowledge being worked out, not by bonded laborers but by biologists and economists.

This suggests points where, I think, the true knowledge might not lead to the attractive-to-me Solar Union, but rather somewhere much darker. If I am a member of one of the subordinate classes, well, the strongest power locally is probably the one dominating me. Maybe solidarity with others would let me overthrow them and escape, but if that united front doesn't form, or fails, things get much, much worse for me. The true knowledge could actually justify obedience to the powers that be, if they're powerful enough, and not enough of us are united in opposition to them.

The other point of failure is this. If I am a member of an oppressing or privileged class, what lesson do I take from the true knowledge? Well, I might try to throw in my lot with the power that will win — but that means abandoning my current goods, the things which presently make me strong and enhance my life. My interest is served by allying with those who are also beneficiaries of inequality, and making sure the institutions which benefit me remain in place, or, if they change, that they alter to be even more in my favor. Members of a privileged class in the grip of moralizing superstition might sometimes be moved by pity, sympathy, or benevolence. Rulers who have themselves accepted the true knowledge will concede nothing except out of calculation that it's better for themselves than the alternative. Voltaire once said something to the effect that whether or not God existed, he hoped his valet believed in Him; it might have been much more correct for Voltaire's valet to hope that his master, and still more rulers like Frederick the Great, feared an avenging God.

My somewhat depressing prospect is that our ruling classes are a lot more likely to talk themselves into the true knowledge by the evolutionary route than the rest of us are to discover revolutionary solidarity — though whether the occasional fits of benevolence on the part of rulers really make things much better than a frank embrace of their self-interest would is certainly a debatable proposition.

## Clicking and Giving Offense

If anyone does want to start propagating the true knowledge, I think it would actually have pretty good prospects. A number of sociologists (Gellner, Boudon) have pointed out that really successful ideologies tend to combine two features. One is that they have a core good idea, one which makes lightbulbs go on for people. Since I can't put this better than Gellner did, I'll quote him:

The general precondition of a compelling, aura-endowed belief system is that, at some one point at least, it should carry overwhelming, dramatic conviction. In other words, it is not enough that there should be a plague in the land, that many should be in acute distress and in fear and trembling, and that some practitioners be available who offer cure and solace, linked plausibly to the background beliefs of the society in question. All that may be necessary but it is not sufficient. Over and above the need, and over and above mere background plausibility (minimal conceptual eligibility), there must also be something that clicks, something which throws light on a pervasive and insistent and disturbing experience, something which at long last gives it a local habitation and a name, which turns a sense of malaise into an insight: something which recognizes and places an experience or awareness, and which other belief systems seem to have passed by.9

I think MacLeod gets this — look at how Ellen May says the true knowledge "struck home with the force of a revelation" (ch. 5, p. 89). But the click for the true knowledge is how it evades the common pitfall of attempts to work out materialist or naturalist ethics. Attempts which ground everything in self-interest and self-assertion have a very strong tendency to end up in mere self-assertion; "good" means "good for me, and for me alone". The true knowledge avoids this; it gives you a way of accepting that you are a transient, selfish mind in a mindless, indifferent universe, and sloughing off thousands of years of accumulated superstitious rubbish (from outright taboos and threats of the Supreme Fascist to incomprehensible commands from nowhere) — you can face the light, and escape the bullshit, and yet not be altogether a monster.

(Boudon would add something to Gellner's requirement that an ideology click: the idea should also be capable of "hyperbolic" use, of being over-applied through neglecting necessary qualifications and conditions. Arguably, the whole plot of The Cassini Division is driven by Ellen May's hyperbolization of part of the true knowledge.)

Clicking is one condition for an ideology to take off; but there's another.

Though belief systems need to be anchored in the background assumptions, in the pervasive obviousness of an intellectual climate, yet they cannot consist entirely of obvious, uncontentious elements. There are many ideas which are plainly true, or which appear to be such to those who have soaked up a given intellectual atmosphere: but their very cogency, obviousness, acceptability, makes them ineligible for serving as the distinguishing mark of membership of a charismatic community of believers. Demonstrable or obvious truths do not distinguish the believer from the infidel, and they do not excite the faithful. Only difficult beliefs can do that. And what makes a belief difficult? There must be an element both of menace and of risk. The belief must present itself in such a way that the person encountering, weighing the claim that is being made on him, can neither ignore it nor hedge his bets. His situation is such that, encountering the claim, he cannot but make a decision, and it will be a weighty one, whichever way he decides. He is obliged, by the very nature of the claim, to commit himself, one way or the other.10

The true knowledge would have this quality, that Gellner (following Kierkegaard) calls "offense", in spades.11

I'll close with two observations about this combination of click and offense. One is that it is of course very common in a certain sort of fiction, and science fiction often indulges in it. Heinlein, in particular, was very good at it, and in some ways The Cassini Division is, the color of Ellen May's hair notwithstanding, a very Heinleinian book, and Ellen May explaining the true knowledge to us is not that different from being on the receiving end of one of Heinlein's in-story lectures. (I know someone else made these points before me, but I can't remember who.) One of the things which makes me like MacLeod's books better than Heinlein's, beyond the content of the lectures appealing more to my prejudices, is that even in the story world, the ideas get opposed, and there is real argument.

The other observation is that MacLeod of course comes out of the Trotskyist tradition, part of the broader family of Communisms. During its glory days, when it was the "tragic hero of the 20th century", Communism quite certainly combined the ability to make things click with the ability to give offense. This must have been one of MacLeod's models for the true knowledge. MacLeod is not any longer any sort of Communist ("the actual effect" of Communism "was to complete the bourgeois revolution ... and to clear the ground for capitalism") or even Marxist, but there is a recurring theme in his work of some form of the "philosophy of praxis" re-appearing. One of the core Marxist ideas, going all the way back to the beginning, is that socialism isn't just an arbitrary body of ideas, but an adaptive response to the objective situation of the proletariat. Even if the very memory of the socialist movement were to vanish, it is (so the claim goes) something which life under capitalism will spontaneously regenerate. One symbol of this in MacLeod's fiction is the scene at the very end of Engine City, where a hybrid creature formed from the remains of three executed revolutionaries crawls from a mass grave. The formation of the true knowledge is another.

I don't, of course, actually believe in the true knowledge, but I find it hard to say why I shouldn't; this makes it, for me, one of MacLeod's more compelling creations. I have kept coming back to it for more than fifteen years now, and I doubt I'm done with it.

1. The Cassini Division, ch. 5, pp. 89--90 of the 1999 Tor edition; ellipses and italics in the original.^

2. Notice how Trotsky says the "interests of the proletariat" lie in "increasing the power of man over nature", not increasing the power of the proletariat over nature, and in "the abolition of the power of man over man", not abolishing the power of others over the proletariat (either as a whole or over its individual members). Thus he can reconcile saying that all moral ideas express a class standpoint with saying that his goals are for the benefit of all humanity. There is an implicit appeal here to an idea which goes back to Marx and Engels, that, because of the proletariat's particular class position, the only way it can pursue its interest is through universal liberation of humanity. What can one say but "how convenient"?^

3. "every natural thing has by nature as much right, as it has power to exist and operate" (II.3); "And so the natural right of universal nature, and consequently of every individual thing, extends as far as its power: and accordingly, whatever any man does after the laws of his nature, he does by the highest natural right, and he has as much right over nature as he has power" (II.4); "whatever anyone, be he learned or ignorant, attempts and does, he attempts and does by supreme natural right. From which it follows that the law and ordinance of nature, under which all men are born, and for the most part live, forbids nothing but what no one wishes or is able to do, and is not opposed to strifes, hatred, anger, treachery, or, in general, anything that appetite suggests" (II.8); "Besides, it follows that everyone is so far rightfully dependent on another, as he is under that other's authority, and so far independent, as he is able to repel all violence, and avenge to his heart's content all damage done to him, and in general to live after his own mind. He has another under his authority, who holds him bound, or has taken from him arms and means of defence or escape, or inspired him with fear, or so attached him to himself by past favour, that the man obliged would rather please his benefactor than himself, and live after his mind than after his own" (II.9--10).^

4. "If two come together and unite their strength, they have jointly more power, and consequently more right over nature than both of them separately, and the more there are that have so joined in alliance, the more right they all collectively will possess." (II.13).^

5. It would be horrifying if everyone were followed around by a drooling slimy befanged monster, careful to hide itself out of our sight, which might devour any one of us without warning at any moment. A philosophy which offered to re-assure us that lurking monsters do not follow us around would arouse little interest.^

6. The basic tit-for-tat strategy is not evolutionarily stable against invasion by more forgiving conditional cooperators, which leads to a lot of technically interesting wrinkles, which you can read about in, say, Karl Sigmund's great Games of Life. But various attempts to dethrone "strong reciprocity" (e.g., "Southampton" strategies, "zero-determinant" strategies) have all, so far as I know, proved unsuccessful.^

7. If I were going to elaborate, I'd have a lot to say about this bit from The Cassini Division (ch. 7, p. 144): "Without power, respect is dead. But our power needn't be the capacity to destroy them — our own infants, and many lower animals, have power over us because our interests are bound up with theirs. Because we value them, and because natural selection has built that valuing into our nervous systems, to the point where we cannot even wish to change it, though no doubt if we wanted to we could. This is elementary: the second iteration of the true knowledge."^

8. Cassini Division, ch. 10, p. 216, my ellipses.^

9. The Psychoanalytic Movement: The Cunning of Unreason, first edition (Evanston, Illinois: Northwestern University Press, 1996), p. 39.^

10. The Psychoanalytic Movement, pp. 40--41.^

11. The Cassini Division, ch. 5, pp. 93--94: "I think about being evil. To them, I realize, we are indeed bad and harmful, but — and the thought catches my breath — we are not bad and harmful to ourselves, and that is all that matters, to us. So as long as we are actually achieving our own good, it doesn't matter how evil we are to our enemies. Our Federation will be, to them, the evil empire, the domain of dark lords; and I will be a dark lady in it. Humanity is indeed evil, from any non-human point of view. I hug my human wickedness in a shiver of delight."^

Posted at May 12, 2015 13:53 | permanent link

## May 05, 2015

I have been very much distracted from blogging by teaching undergraduates (last semester; this semester), by supervising graduate students, and by Life. Thus even this link round-up is something I literally began years ago, and am only now posting for lack of time to do real blogging.

Posted at May 05, 2015 22:28 | permanent link

## April 30, 2015

### Books to Read While the Algae Grow in Your Fur, April 2015

Attention conservation notice: I have no taste.

Christian Caryl, Strange Rebels: 1979 and the Birth of the 21st Century
A very nicely written popular history of five movements that either began or reached a peak in 1979: the Iranian Revolution, the Soviet-Afghan War, Deng Xiaoping's economic reforms in China, Margaret Thatcher's government in Britain, and John Paul II's first visit as Pope to Poland, viewed as part of a moral campaign against the Soviet Union. Caryl, quite rightly, views these all as anticipations of trends that have come to shape our world, and look likely to keep shaping the 21st century --- the economic rise of China, the collapse of Soviet Communism, neoliberalism, Islamism. (I don't believe he ever uses the word "neoliberal" or its derivatives.) Beyond the coincidence of dates, he also links them through their opposition to what, for most of the 20th century, could have been seen as its dominant trends of secularization and socialism, though he's very careful to note how, e.g., post-revolutionary Iran retained and even amplified many modernization initiatives of the Shah's regime, or how Thatcher left alone much of the British welfare state. (He has a nice passage, which irritatingly I cannot find again, about how it's much easier to be a rugged individualist when you know you can go to a hospital for free if you're sick, and will have a guaranteed pension when you're too old to work.) He is also, mercifully, restrained in claiming causal connections between these events — the biggest is how much the USSR's military commitments in Afghanistan limited its ability to use force in eastern Europe — or abstract, thematic parallels. I think he's too inclined to give credit to Thatcher's economic policies, but otherwise I have few complaints or quibbles. Recommended.
Sydney Padua, The Thrilling Adventures of Lovelace and Babbage: The (Mostly) True Story of the First Computer
If you enjoy this weblog, it is very likely that you are part of the target audience for an irregular comic, in which Lovelace and Babbage were hived off into a pocket universe in which they actually built the Analytical Engine, and used it to fight crime. This is precisely that comic, with footnotes and bonus material about how the Engine would have looked and worked. I recommend it very highly.
Michael Ellman, Socialist Planning
A sort of soberer older cousin to Red Plenty; at least in the first edition of 1979, which is what I read. (I am curious to see what revisions Ellman made later, since 1979 was of course when things began to change radically...)
The book primarily attends to the USSR and its satellites in eastern Europe; the secondary focus is on China. (There are occasional discussions of Yugoslavia and Cuba, and mentions of the Communist governments of southeast Asia, but nothing about how they actually ran things; and I don't believe North Korea is ever named at all.) There was a lot of interesting and valuable detail about how the Soviets actually drew up plans and tried to implement them. About China Ellman had to be sketchier, because there was simply so little available information, and because the process was itself more chaotic. (While Ellman was evidently more skeptical about Maoism and the Cultural Revolution than many westerners were in the 1970s, this is one place where I hope he'd like to revise and extend his remarks.) Indeed, from reading Ellman, I find myself doubting that pre-Deng China really had economic plans, as opposed to mere orders...
Ellman's attempt to draw up some sort of assessment of the accomplishments and defects of state socialism proceeds from both a liberal and a Marxist perspective. I think he wasn't harsh enough on the ways capitalist countries fail liberalism, or on the ways state socialist countries failed socialism. Whether what's come since in the former second world is an improvement from either point of view is a more complex question, which this book is necessarily silent on.
Gene Wolfe, Sword of the Lictor
In which Severian commits another crime of mercy, wanders in the wilderness (much of it consisting of relics of former civilizations), and is offered all the kingdoms of the world if he will but fall down and worship a resurrected two-headed tyrant. Also, a story is told which is somewhere between the myth of Romulus and Remus, and The Jungle Book.
I wish I understood how exactly Wolfe manages to convey the sense that he understands all the mysteries he is hinting at, without actually explaining much of anything. (Previously: 1, 2; subsequently: 4)
Milovan Djilas, The New Class: An Analysis of the Communist System
Djilas was probably the highest-ranking member of any ruling Communist party to become an anti-Communist. This was his 1957 attempt at an explanation of what Communist governments, like the one he helped found in Yugoslavia, were actually up to. This analysis was plainly very strongly influenced by Marxism, because it's all about class struggle and the over-riding historical imperative of enhancing production. Dividing through for some of that, Djilas's big idea is that Communism amounted to collective ownership, not by the whole people, but by a "new class" of economic managers and party functionaries. This new class pursued rapid industrialization partly so they would have more to exploit, but even more so as to not be utterly overwhelmed by the advanced capitalist countries. Having a superior position to other classes within the countries they ruled, they naturally used it in self-interested ways; thus class conflict, far from being eliminated, persists. Djilas explains Communist dictatorship, suppression of freedom of thought and other civil liberties, etc., as the new class's means of maintaining its firm position of collective ownership against possible threats, not so much from reactionaries as from other classes. (Cf. the much later joke about Brezhnev's mother.) I have to confess that I cannot follow his explanation of why the new class's relations of production are incompatible with democracy within the Party, or even with "formal" democracy for the country as a whole. (After all, capitalist and even slave-owning societies have been compatible with republicanism and democracy for privileged classes.) While recognizing the diminution of the intensity of repression after Stalin, Djilas emphasized that this involved no change in principle or actual entrenchment of rights the state and party were bound to respect.
As I said, this strikes me as not just more-or-less Marxist but an amplification of the Trotskyist view of what went wrong in the USSR --- though without Trotsky's need to justify the 1917 Revolution and his own actions in helping build the Soviet state. That is, Djilas doesn't regard the new class as some sort of bureaucratic degeneration of a workers' state, still less a reactionary restoration of capitalism, but rather as the logical end-point of the Communist trajectory.
Wisely, Djilas offered no concrete forecasts as to what would happen to the Communist governments. I don't feel competent to say whether subsequent events were compatible with his theory.
There was a curious after-life to Djilas's theory of the "new class", since it was picked up by American neo-conservatives and used as a club in the culture wars. Their theory was (I am not making this up) that a tendency to social and cultural liberalism on the part of teachers, journalists and entertainers is just like the apparatchiks' complete control of the economic resources and repressive force of totalitarian states. There is thus a (thin, twisty, strained) line of intellectual descent from Djilas to clowns like Glenn Beck; it would be grimly amusing to read a full history of this some day.
Gareth Hinds, Macbeth
A graphic-novel adaptation, aimed at younger readers. The selections from the text, and minor modernizations to vocabulary, are all well-chosen. More important, the drawing is excellent, and actually really amplifies the text; I doubt I will ever see Lady Macbeth, the witches, or Banquo other than this way. (Hinds's Macbeth will however compete in my memory with Toshiro Mifune.)
Disclaimer: Hinds is married to a friend of my brother's. This is how I came to look at his book, but has no (conscious) bearing on my review.
Ian Tregillis, The Mechanical
Mind candy: a sort of alternate-historical fantasy, in which Huygens, by evidently stealing ideas from Newton, invented alchemical-mechanical golems (called "clakkers"), leading to the Dutch taking over all of the world except for New France, home of the Papacy and French monarchy in exile. Of course, clakkers turn out to be conscious, to experience excruciating pain when they receive orders, and to have some mysterious means of liberation... (There is a lot about Spinoza, which will probably become clearer in the inevitable sequels.) It's catnip for those who have spent far too much time reading about 17th-century science and philosophy, especially in the United Provinces, while simultaneously liking fantasy.
Tregillis's self-presentation.
Marie Brennan, Voyage of the Basilisk
Mind candy. I enjoyed it, but I suspect it's only for those who have read previous installments in Lady Trent's adventures.
David Brion Davis, Inhuman Bondage: The Rise and Fall of Slavery in the New World
Popular history by a respected historian. (He says at the beginning that it began as notes for a short course for high school teachers, which strikes me as an excellent thing.) While slavery in the United States occupies most of the book, Davis is very good at setting that in the context of slavery throughout the Americas (as the subtitle indicates), and indeed the broader historical context of slavery in Europe, southwest Asia and Africa. What becomes depressingly clear from reading this (if it wasn't already) is just how utterly central slavery was to the formation of the modern world economy and to the European colonization of the Americas, and just how monstrous an institution it was. I can't decide if that makes the first word of the title well-chosen, or if it shouldn't rather force us to define down our notion of what acting like a human being means. (One must imagine Simon Legree saying "Am I not a man and a brother?") That abolitionism became a serious movement anywhere, let alone one which was able to succeed in some country through the force of mere persuasion, is almost as astonishingly hopeful as the history of slavery is depressing. Davis's narrative of the slave power within the US, culminating in enormously destructive treason in defense of slavery (*), is exemplary, as is the treatment of Lincoln and the failure of reconstruction.
*: I cannot now recall where I learned this apt phrase; Davis does not, I believe, use it.
Randal Douc, Eric Moulines and David S. Stoffer, Nonlinear Time Series: Theory, Methods, and Applications with R Examples [R code, errata, data]
This is a thorough, but introductory, treatment of the statistical theory of parametric nonlinear time series models, illustrated with many concrete mathematical and data-analytic examples. The emphasis is however on the theory, I think quite rightly, because the theory is crucial for understanding what methods work and why they do so.
The book consists of three parts, only loosely linked. Part I is on basic time series models and methods: it reduces all of linear-Gaussian model theory to two chapters, including ARMA, time- and frequency- domain methods, and the linear Gaussian state-space model. Chapters 3 and 4 tour various nonlinear models, mostly parametric ones, and illustrate some of the need for non-linearities.
Part II (chapters 5--8) is a short course on the ergodic theory of Markov processes. Like everyone else since the early 1990s, their treatment of this subject is heavily influenced by Meyn and Tweedie's Markov Chains and Stochastic Stability. Douc et al. particularly emphasize uniform and V-geometric ergodicity, that is, conditions under which a Markov process's state distribution converges exponentially fast to an invariant or equilibrium distribution. (The ugly phrase "V-geometric" combines the notion of an exponential or geometric convergence rate with a particular way of measuring distance between distributions, generalizing total variation or $L_1$ distance.) Given rapid convergence of distributions, especially with calculable rates, one can get limit theorems about time averages, both almost-sure ("Birkhoff") limit theorems which, like the law of large numbers, say that time averages converge on expectation values, and central limit theorems giving the Gaussian fluctuations around the limits. These in turn provide the foundations for maximum likelihood estimation of Markov models, and what Douc et al., following Cox (sec. 3.2), call "observation-driven" models, a.k.a. chains with complete connections or stochastic automata. Many of the specific models from Part I reappear in this part as examples, but I believe one could read Part II without ever looking at Part I.
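A toy numerical illustration of geometric ergodicity (mine, not the book's, and for a finite chain where everything is exactly computable): for a two-state Markov chain, the total-variation distance to the invariant distribution shrinks at each step by exactly the modulus of the second-largest eigenvalue of the transition matrix.

```python
import numpy as np

# Two-state Markov chain; eigenvalues of P are 1 and 0.7, so the
# distribution converges to the invariant one geometrically at rate 0.7.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Invariant distribution: the left eigenvector of P for eigenvalue 1,
# normalized to sum to 1 (here pi = (2/3, 1/3)).
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()

mu = np.array([1.0, 0.0])  # start concentrated in state 0
tv = []
for _ in range(30):
    tv.append(0.5 * np.abs(mu - pi).sum())  # total-variation distance
    mu = mu @ P                             # one step of the chain

# The ratio of successive TV distances is the second eigenvalue, 0.7.
print(round(tv[10] / tv[9], 2))  # → 0.7
```

In two dimensions the error vector necessarily lies along the second left eigenvector, so the geometric rate is exact; the book's V-geometric machinery is what lets one get such exponential rates for general state spaces, where no such finite eigen-decomposition exists.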
Part III is about state-space or hidden Markov models, where there is a latent or hidden state that evolves according to a Markov process, and we observe a noisy, generally nonlinear, function of the state. The special case where the state evolves linearly, the observation is linear in the state, and all noise terms are Gaussian allows for an exact treatment, given in chapter 2. The other special case where the state and the observation are both discrete also allows for exact treatment, the EM algorithm (as originally invented by Baum and Welch). The key problems for state-space models are estimating the current state given all observations up to the present, called "filtering", and estimating the sequence of states over some time interval given observations over a (possibly longer) interval, called "smoothing". If the model parameters are known, then the solution to both problems is given formally by Bayes's rule (cf.), but of course it cannot actually be calculated, so one must approximate.
Chapters 10 and 11 explain the "particle" approach to filtering and smoothing. The particle filter is easier to explain than the particle smoother, so I'll stick with the former. Start by randomly drawing $N$ "particles" in the hidden state space according to the initial state distribution, call these $X_0^1, X_0^2, \ldots X_0^N$. (The subscript indicates time, the super-script indexes particles.) Then have each particle evolve its state independently, according to the Markov process for the states, so $p(\tilde{X}_1^i|X_0^i) = q(X_0^i,\tilde{X}_1^i)$ for the fixed Markov kernel $q$ describing the state evolution. (Remember, we're assuming the parameters are known for the moment.) Now take the first observation $Y_1$. It will have some conditional probability (or probability density) given each particle's state, $p(Y_1| \tilde{X}_1^i) = r(\tilde{X}_1^i, Y_1)$. Resample the particles with probabilities proportional to the $r(\tilde{X}_1^i, Y_1)$, so that we get $N$ particles $X_1^1, X_1^2, \ldots X_1^N$. It is not hard to convince oneself that (as $N\rightarrow\infty$) the distribution of $X_1$ particles converges on the actual distribution $p(X_1|Y_1)$. Now evolve the $X_1$ particles to get $\tilde{X}_2$ and repeat the cycle with the next observation, $Y_2$. Moreover, by throwing enough particles at the problem, we get a consistent approximation to $p(X_t|Y_1, \ldots Y_t)$, and this approximation can even be kept up over time.
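The recipe above is easy to code. Here is a minimal bootstrap particle filter, in Python rather than the book's R, for a toy linear-Gaussian model with parameters of my own choosing; that case is useful as a sanity check, since the Kalman filter gives the exact answer there (a steady-state filtering standard deviation of about 0.45 for these parameters).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy state-space model (all parameters assumed known, as in the text):
#   X_t = phi * X_{t-1} + N(0, sig_x^2),   Y_t = X_t + N(0, sig_y^2)
phi, sig_x, sig_y = 0.9, 1.0, 0.5
T, N = 50, 2000  # time steps, number of particles

# Simulate one trajectory of hidden states and observations.
x = np.zeros(T)
x[0] = rng.normal(0, sig_x / np.sqrt(1 - phi**2))  # stationary start
for t in range(1, T):
    x[t] = phi * x[t - 1] + sig_x * rng.normal()
y = x + sig_y * rng.normal(size=T)

# Bootstrap particle filter: propagate through the state kernel q,
# weight by the observation density r, resample.
particles = rng.normal(0, sig_x / np.sqrt(1 - phi**2), size=N)
filt_mean = np.zeros(T)
for t in range(T):
    particles = phi * particles + sig_x * rng.normal(size=N)  # evolve: q
    w = np.exp(-0.5 * ((y[t] - particles) / sig_y) ** 2)      # weight: r
    w /= w.sum()
    filt_mean[t] = w @ particles                              # filtered mean
    particles = rng.choice(particles, size=N, p=w)            # resample

# With N this large, the filtered mean should track the hidden state to
# roughly the accuracy of the exact Kalman filter.
rmse = np.sqrt(np.mean((filt_mean - x) ** 2))
```

Notice that the per-step cost is linear in $N$, and that resampling is what keeps the particle cloud from degenerating into a few huge weights; the book's chapters 10--11 are largely about doing this more cleverly than the naive multinomial resampling used here.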
Chapter 12 extends particle filtering and smoothing to likelihood-based inference, i.e., to approximating $p(Y_1, \ldots Y_n; \theta)$, where the parameter $\theta$ now influences both the state evolution (above, $q$) and the observation mechanism (above, $r$). Given a decent approximation to the likelihood, one can (with some care) maximize it, or one can dilute its evidence about $\theta$ with a prior and do Bayesian inference. Chapter 13 provides a very nice information-theoretic treatment of the consistency and asymptotic efficiency of maximum likelihood, without presuming that the model is well-specified. The consistency of Bayesian inference is not touched on, which may be just as well because it's a much harder problem.
The mathematical exercises are good; I confess I didn't try the computational ones. I would be very happy to use this book for a graduate course in time series theory, emphasizing Parts II and III, and probably combined with Fan and Yao for non-parametric models, and/or Gourieroux and Monfort for simulation-based inference of scientific models.
Disclaimer: I know Prof. Stoffer slightly, which is, I presume, why the publisher sent me an unsolicited review-copy of this book last year.

Posted at April 30, 2015 23:59 | permanent link

## April 09, 2015

### "Sparse Graph Limits with Applications to Machine Learning" (Week after Next at the Statistics Seminar)

Attention conservation notice: Notice of an upcoming academic talk at Carnegie Mellon. Only of interest if you (1) care about how the mathematics of graph limits intersects with non-parametric network modeling, and (2) will be in Pittsburgh week after next.
Jennifer Chayes, "Sparse Graph Limits with Applications to Machine Learning"
Abstract: We introduce and develop a theory of limits for sequences of sparse graphs based on $L^p$ graphons, which generalizes existing theories of graph limits, and in particular includes graphs with power-law degree distributions. We then apply these results to nonparametric stochastic block models, which are used by statisticians to analyze large networks. We use our sparse graph limit results to derive strong results on estimation of functions characterizing these nonparametric stochastic block models. This talk assumes no prior knowledge of graphons or stochastic block models. The talk represents joint work with Christian Borgs, Henry Cohn, Shirshendu Ganguly, and Yufei Zhao.
Time and place: 4 pm on Monday, 20 April 2015, in Scaife Hall 125

As always, the talk is free and open to the public.

(I'd write something long here about why graph limits are so interesting, but why repeat myself?)

Posted at April 09, 2015 19:02 | permanent link

## March 31, 2015

### Books to Read While the Algae Grow in Your Fur, March 2015

Attention conservation notice: I have no taste.

Anthony Shadid, House of Stone: A Memoir of Home, Family, and a Lost Middle East
Shadid's memoir of restoring his family's ancestral home in a small Christian town in south Lebanon, inter-cut with the scenes from the story of how his family came from Lebanon to Oklahoma, and the history of the town itself. It's more fascinating and lovely than an account of a mid-life crisis resolved through remodeling has any right to be, and becomes almost unbearably sad when one knows about how the author died.
Debra Doyle and James D. Macdonald, Mageworlds series: The Price of the Stars, Starpilot's Grave, By Honor Betray'd, The Gathering Flame, The Long Hunt, The Stars Asunder, A Working of Stars
Mind candy. I'd read the first two, years and years ago, and enjoyed them, but was inspired to pick up the rest of the series by James Nicoll's recent review of the first book. While it would be astonishing if the story didn't begin as Star Wars fanfic, one should really think of these as that universe re-imagined by writers of talent and imagination, trying to come up with decent reasons for everything, and freely reworking as necessary. I won't pretend they're high literature, but they are a non-guilty pleasure.
Daniel Davies and Tess Read, The Secret Life of Money: Everyday Economics Explained
A compact series of brief vignettes about the economics of lots of different sorts of businesses, running from trade-shows to martial arts schools. It's not as hilariously funny as the best of Davies's blogging, but it is good, and makes me wish they'd be quixotic enough to write a systematic econ-for-beginners book.
(The writing is colloquial enough that I found myself looking up perhaps half-a-dozen Anglicisms; I didn't mind, but others might.)
Disclaimer: I've been a fan of Davies's blog, and sporadic correspondent, for many years.
Charlaine Harris, A Secret Rage
Harris's second (?) novel, from 1984, a mystery about an ex-model returning from New York to a small Southern college town — which is not dealing very successfully with a serial rapist. I hesitate to label this one "mind candy", because it deals with much more serious themes than usual, and does so well. I can't speak to its portrayal of surviving rape, but it's really convincing at the emotional pain, dis-orientation, and shame that can come from experiences that break one's sense of who one is, or of the kind of life one has, including the feeling of "this can never be repaired". In the end, though, I'm not sure it isn't candy of a sort. (That is not a complaint or a put-down.)
— Of course, this was written more than thirty years ago. It's striking to me how little the cultural politics have moved on, even while many things large and small for the story are quite transformed. Ones which struck me (some of them arguably spoilers, so I ROT-13'd): gur bcravat ivrj bs Arj Lbex Pvgl nf n uryyfpncr, be ng yrnfg n chetngbel, bs ivbyrag pevzr; vg'f abg orvat pbzzba xabjyrqtr gung encvfgf, bapr pnhtug, pna or purzvpnyyl zngpurq gb gurve fcrez; gjb beqvanel crbcyr orvat qrsrngrq ol gur cebfcrpg bs pbyyngvat gjb yvfgf bs n uhaqerq anzrf; gur urebvar'f oblsevraq glcvat ure rffnlf sbe ure (abg uvf orvat pbafvqrengr gung jnl, ohg gur vqrn gung fur jbhyqa'g whfg unir jevggra gurz urefrys ba n znpuvar); gur urebvar'f abg dhrfgvbavat sbe n zbzrag gung n pynff va Punhpre jvyy uryc ure orpbzr n pbagrzcbenel abiryvfg.
Laura Bickle, Dark Alchemy
Mind candy: contemporary Weird Western, heavy on alchemical symbolism (reasonably well-researched). Lots of mysteries are left unexplained at the end; I hope they stay that way.
Stephen King, Mr. Mercedes
Mind candy: no supernatural elements, just a psycho-killer in a country going to hell in a slow handbasket. (Though: "Knights of the Badge and the Gun" is a bit of an internal reference.) The protagonists are somewhat stock characters, but the story moves along well.
ROT-13'd: Ner jr _fhccbfrq_ gb srry gung Ubyyl vf nyzbfg nf zrffrq hc nf Ze. Zreprqrf, naq cbgragvnyyl nf qnatrebhf? (Abg nf pehry, ab.) Orpnhfr gung'f fher nf uryy gur vzcerffvba V jnyxrq njnl jvgu.
A. W. van der Vaart, Asymptotic Statistics
A well-written and thorough introduction to the highlights of classic statistical theory, especially as shaped by Le Cam and his followers. The aim here is to present the classic results in as streamlined, elegant and modern a manner as possible, rather than following the often-cumbersome original proofs of the 1920s--1960s. Thus for instance van der Vaart gives a wonderfully simple, but correct, account of the "delta method" (chapter 3), and of convergence of estimators based on minimizing loss functions ("M" estimators) or solving equations which set gradients to zero ("Z" estimators; both chapter 5), topics which in many other textbooks are so buried in technicalities that the main ideas are almost invisible. While the bulk of the material is on parametric inference, later chapters deal with topics like density estimation and semi-parametric regression.
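For readers who haven't met it, the scalar form of the delta method is simple to state (this is the standard result, not a summary of van der Vaart's particular proof): if

$$ \sqrt{n}\,(T_n - \theta) \rightsquigarrow N(0, \sigma^2) $$

and $\phi$ is differentiable at $\theta$, then

$$ \sqrt{n}\,(\phi(T_n) - \phi(\theta)) \rightsquigarrow N(0, [\phi'(\theta)]^2 \sigma^2) , $$

i.e., smooth transformations preserve asymptotic normality, with the variance rescaled by the squared derivative.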
In addition to Le Cam's device of relating a sequence of "experiments" (really, probability models) to a "limiting experiment", the major theoretical tool here is empirical process theory. My one pedagogical issue with the book is that empirical process theory is introduced relatively late (chapter 19), even though results from it are used much earlier (chapter 5 at the latest). Were I to teach from this book, I'd probably just move that material earlier.
Some prior acquaintance with the usual machinery of hypothesis testing, estimation, likelihood, etc., is required (Casella and Berger would be more than adequate, perhaps even All of Statistics), along with the rudiments of measure-theoretic probability. Given that background, this is the most elegant and up-to-date text on this material I've found, and would make for the core of a very good graduate course in statistical theory.
(A tangential reflection: if you go back to the literature in the 1980s and early 1990s, you can sort of see two strands of ideas about reformulating parametric inference in more advanced and systematic mathematical terms. One of these is information geometry, which draws links to differential geometry and information theory. The other strand, of which this book is a fruit, is empirical process theory. It's interesting to me that information geometry hasn't seemed to lead to much beyond more elegant formulations of the classical ideas of parametric statistical theory, while studying empirical processes has been much more productive of new results and new areas. [And I say this as someone who finds information geometry more attractive intrinsically.] Is this a fair assessment? If it is, is this contrast intrinsic to the two approaches, or just an accident of their developments?)
Disclaimer: I've never met or corresponded with Prof. van der Vaart, but he's a very eminent figure in my field, and might well referee one of my papers or grants some day (if he hasn't already). (Correction, next month: I was wrong, I wrote to ask him for a copy of a paper in 2006.)
Paul Hazard, The Crisis of the European Mind, 1680--1715
This is erudite, lively, contagiously enthusiastic, and admirably broad-minded. If it has a coherent thesis it frankly eludes me. The introduction makes it sound like this period is an unprecedented rupture in European culture, while the conclusion has it being the return of the Renaissance. Then again, in places Hazard makes it seem like his theme is how European high culture learned to embrace skepticism, but in others it seems to be about how it learned to avoid skeptical conclusions, or rather avoid the Pyrrhonist suspension of all judgment. Instead, it's best read as a series of case studies, with interesting connections drawn between them, like that between Locke's philosophy and the rise of sentiment and sensation.
As usual, part of me laments the lack of quantitative comparisons (yes, this period had a lot of interest in travel and travel writings --- was it really more than the one just before? were there really more enthusiastic religious movements?). On the other hand, I suspect the materials for doing that sort of history weren't available in 1935, and that the result would be much less fun to read.
— Further comment is out-sourced to David Auerbach (with the small correction that Hazard does mention Leeuwenhoek, giving him most of p. 309).
Seanan McGuire, Pocket Apocalypse
Mind candy: in which our heroes confront an outbreak of lycanthropy in Australia. Probably not so fun if you haven't at least read the previous book in the series.
Joe Abercrombie, Half a King and Half the World
Mind candy: Viking-toned fantasy epics. There is a lot of blood and brutality and (it can't be a spoiler if it's in the book description) betrayal, but it seems much more hopeful than Abercrombie's best-known books, in that virtue is not destined to be its own punishment. I read each in one sitting, because they were just that fun, but with some trepidation on the characters' behalf. On the other hand, there's apparently at least one more book to come in the series, so there's still time for every hope to be blasted and every good thing in the characters' lives to be twisted into a burden or a mockery.
— On Half a King: I wonder if this is in some way a reaction to, or even a tribute to, the works of Lois McMaster Bujold? Yarvi, as the clever, deformed prince despised by the warriors around him, is an obvious analog for Miles, and his mother Laithlin is shown dominating by force of personality as Cordelia does, though in a rather different way. And (ROT-13'd) Lneiv'f obgu orvat fbyq vagb fynirel nf n ebjre, naq hygvzngryl fnirq guebhtu yblnygl gb uvf bne-zngrf, vfa'g n zvyyvba zvyrf njnl sebz jung unccraf gb Pnmnevy va _Phefr bs Punyvba_.
Half the World, however, seems inspired by the thought "Eowyn would have been a very difficult teenager".

Posted at March 31, 2015 23:59 | permanent link

## February 28, 2015

### Books to Read While the Algae Grow in Your Fur, February 2015

Attention conservation notice: I have no taste.

Paul J. McAuley, Something Coming Through
Mind candy science fiction: in which the aliens showed up some years in the narrative past, and offered us access to the same dozen habitable planets orbiting red dwarf stars that they had apparently offered many previous, now-vanished species. Of course, we took them up on it... The combination of bizarre ancient inhuman technologies and ecosystems with all-too-human vice, folly and social malaise proves to be, about as well as you'd expect, a fertile generator of Plot. There are also some really fine descriptive passages, and McAuley returns to a theme of some of his earlier books (e.g., Fairyland): Europe (especially England) being gradually filled with weirdness seeping in from the margins, a weirdness driven by not-human, or no-longer-human, forces.
While this isn't one of his very best, like the Confluence trilogy or The Quiet War and its sequels, it's still really good; I read it in as close to one sitting as teaching allowed, with great enjoyment.
McAuley's self-presentations: 1, 2.
Russell A. Poldrack, Jeanette A. Mumford and Thomas E. Nichols, Handbook of Functional MRI Data Analysis
A brief (~190 pp. of main text) introduction to analyzing functional neuro-imaging data. It's not until about page 90 that they get to statistical models of the relationship between stimuli or actions and the neural signals. This is because the whole first half of the book, entirely appropriately, is about how "the data" are actually constructed.
When neurons spike, they consume energy and need oxygen. The blood vessels in the brain respond by delivering more highly-oxygenated blood to them, over-compensating, with some delay, for the original oxygen consumption. Oxygen-rich blood has slightly different magnetic properties than oxygen-poor blood. Functional MRI is able to measure changes in this "blood oxygenation level dependent" (BOLD) signal over space and time. But what we're really interested in is the neural activity, and the BOLD signal is full of noise and artifacts. To get values suitable for statistical models of neural activity out of this, we need to go through a very complicated process, where every step itself relies on a different statistical or computational model, itself resting on a more-or-less explicit theory --- about how the MRI signal is acquired, about the nature of the "hemodynamic response", about anatomical variation between people, etc., etc. The end result of all this transforming, manipulating, filtering, adjusting, distorting, and discarding is what people like me are accustomed to calling "the data". The authors are very sound on all this, including the vital importance of human quality control at every stage.
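To make the modeling step concrete, here is a cartoon of the standard approach (not the authors' code; the double-gamma response shape is a common convention, and all the constants below are my own illustrative choices): convolve the stimulus time course with an assumed hemodynamic response function, then estimate the activation amplitude by ordinary least squares.

```python
import numpy as np
from math import gamma as G

def hrf(t):
    # Double-gamma hemodynamic response: a positive peak around 5 s minus
    # a small, slower undershoot (illustrative constants, not any
    # package's canonical ones).
    peak = t**5 * np.exp(-t) / G(6)
    undershoot = t**15 * np.exp(-t) / G(16)
    return peak - undershoot / 6.0

tr = 1.0                       # sampling interval ("TR"), in seconds
t = np.arange(0, 30, tr)       # support of the response function
n_scans = 120

stim = np.zeros(n_scans)       # stimulus time course: two blocks
stim[10:15] = 1.0
stim[60:65] = 1.0

# Predicted BOLD regressor: stimulus convolved with the response shape.
predicted = np.convolve(stim, hrf(t))[:n_scans]

# Simulate a noisy voxel with true activation amplitude 2.0.
rng = np.random.default_rng(1)
bold = 2.0 * predicted + rng.normal(0, 0.5, size=n_scans)

# GLM: design matrix = [convolved regressor, intercept]; fit by OLS.
X = np.column_stack([predicted, np.ones(n_scans)])
beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
# beta[0] estimates the activation amplitude, and should land near 2.0.
```

The hemodynamic delay is why one regresses the signal on the *convolved* stimulus rather than the stimulus itself; everything in the book's first half is, in effect, about making sure the `bold` vector fed into this regression means what this model assumes it means.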
They are also pretty good on the statistical methods which come after that point, in the second half of the book, though rather conventional in their emphasis on linear regression models. The mathematical level is kept deliberately elementary, but it does a good job of clarifying some common pitfalls (circularity, a.k.a. "voodoo correlations"; in-sample evaluations of predictive power). Neither half of the book gives all the technical details which a complete newbie would need, but both are quite good at both painting the big picture and including crucial details. I strongly recommend it as a first book for anyone seriously contemplating working with fMRI data.
Disclaimer: I know both Poldrack and Nichols professionally, and correspond with them occasionally; this led to my picking up the book, but isn't, I think, enough to make me give it a positive review.
Laura Kipnis, Against Love
There are a few passages of genuinely good and insightful writing here. (The catalog of things people in couples are not allowed to do, for instance, is simultaneously hilarious and cringe-inducing.) That said... By intellectual conviction, family history and personal temperament I'm pre-disposed to doubt that contemporary American serial monogamy is the optimal way of arranging the sexual, family and emotional lives of the East African Plains Ape. But most of Kipnis's book is, so help me, uncritically recycled "critical theory", primarily Marcuse and Foucault, complete with the "if X is the Y of Z, then U is the V of W" stylistic tic. Kipnis's addition to the Masters of '68 is to double down on the assertion that there is some sort of mutually supportive relationship between serial monogamy and capitalism. It would be a tricky exercise in both rhetoric and logic to work out whether Kipnis's text puts this forward as a mere series of similes, which neither require evidence nor possess implications, or as a piece of functionalism bordering on a conspiracy theory. Worse, the masses of such argumentation overwhelm the funny, pointed and/or insightful bits.
I was entirely in the mood, this Valentine's Day, to read a polemic against our contemporary ideas of love; I wish it had been good. I am a bit sad because I suspect it could have been good, if only Kipnis had thrown out a whole bunch of books from graduate school.
Disclaimer: Kipnis is three degrees of separation from me in blog space, so I may be motivated to go easy on her book.
Charles Stross, Rule 34
More near-future techno-financial crime in Edinburgh. It says bad things about me that I found myself wanting to channel some of the Toymaker's rants during office hours.
Ben Aaronovitch, Foxglove Summer
In which our London-born-and-bred hero confronts the alien horrors of the English countryside. (Given the setting, I kept half-expecting a cameo from the Rev. Ms. Merrily Watkins.) Very enjoyable, and in terms of plot fairly distinct from previous books, but I'm pretty sure it's not the place to enter into the series.

Posted at February 28, 2015 23:59 | permanent link

## January 31, 2015

### Books to Read While the Algae Grow in Your Fur, January 2015

Attention conservation notice: I have no taste.

Jin Feng and Thomas G. Kurtz, Large Deviations for Stochastic Processes
Kurtz is best known for work in the 1970s and 1980s on how sequences of Markov processes converge on a limiting Markov process, especially on how they converge on a limiting deterministic dynamical system. This book is an extension of those ideas, and best appreciated in that context.
As every school-child knows, we ordinarily specify a Markov process by its transition probabilities or transition kernels, say $\kappa_t(x,B) = \Pr(X_{t}\in B \mid X_0=x) ~.$ The transition kernels form a semi-group: $\kappa_{t+s}(x,B) = \int{\kappa_t(y,B) \kappa_s(x,dy)}$. For analytical purposes, however, it is more convenient to talk about transition operators, which give us conditional expectations: $T_t f(x) = \Expect{f(X_{t})\mid X_0=x} = \int{f(y)\kappa_t(x,dy)} ~.$ (It's less obvious, but if we're given all the transition operators, they fix the kernels.) These, too, form a semi-group: $T_{t+s} f(x) = T_t T_s f(x)$. The generator $A$ of the semi-group $T_t$ is, basically, the time-derivative of the transition operators, the limit $A f(x) = \lim_{t\rightarrow 0}{\frac{T_t f(x) - f(x)}{t}} ~,$ so that $\frac{d}{dt}{T_t f} = A T_t f$. A more precise statement, however, which explains the name "generator", is that $T_t f = e^{t A}f = \sum_{m=0}^{\infty}{\frac{(tA)^m f}{m!}} ~.$ Notice that the transition operators and their generator are all linear operators, no matter how nonlinear the state-to-state transitions of the Markov process may be. Also notice that a deterministic dynamical system has a perfectly decent transition operator: writing $g(x,t)$ for the trajectory beginning at $x$ at time $0$, $T_t f(x) = f(g(x,t))$, and $A f(x) = {\left.\frac{d T_t f(x)}{dt}\right|}_{t=0} ~.$
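For a finite state space these abstractions become concrete matrices, which makes the relations above easy to check numerically. A toy sketch, not from the book; the two-state rates are made up:

```python
import numpy as np
from scipy.linalg import expm

# Toy two-state continuous-time Markov chain: the generator A has
# off-diagonal entries equal to the jump rates, and rows summing to zero.
A = np.array([[-2.0,  2.0],
              [ 1.0, -1.0]])

def T(t):
    """Transition operator: here just the matrix exponential e^{tA},
    acting on a "function" f (a vector of its values) by T(t) @ f."""
    return expm(t * A)

# The semi-group property T_{t+s} = T_t T_s:
print(np.allclose(T(0.3) @ T(0.9), T(1.2)))               # True

# The generator is recovered as the derivative of T_t at t = 0:
h = 1e-6
print(np.allclose((T(h) - np.eye(2)) / h, A, atol=1e-4))  # True
```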
Suppose we have a sequence of Markov processes, $X^{(1)}, X^{(2)}, \ldots$. What Kurtz and others showed is that these converge in distribution to a limiting process $X$ when their semi-groups $T^{(n)}_t$ converge to the limiting semi-group $T_t$. This in turn happens when the generators $A^{(n)}$ converge on the limiting generator $A$. To appreciate why this is natural, remember that a sequence of distributions $P^{(n)}$ converges on a limiting distribution $P$ if and only if $\int{f(x) dP^{(n)}(x)} \rightarrow \int{f(x) dP(x)}$ for all bounded and continuous "test" functions $f$; and $A^{(n)}$ and $A$ generate the semi-groups which give us conditional expectations. (Of course, actually proving a "natural" assertion is what separates real math from mere hopes.) In saying this, naturally, I gloss over lots of qualifications and regularity conditions, but this is the essence of the thing. In particular, such results give conditions under which Markov processes converge on a deterministic dynamical system, such as an ordinary differential equation. Essentially, the limiting generator $A$ should be the differential operator which'd go along with the ODE. These results are laws of large numbers for sequences of Markov processes, showing how they approach a deterministic limit as the fluctuations shrink.
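A quick illustration of the kind of limit theorem at issue (my sketch, not the book's; the SIS-epidemic rates are chosen arbitrarily): the infected fraction in a population of $n$ individuals is a Markov jump process, and as $n$ grows it hugs the solution of a logistic-type ODE.

```python
import numpy as np

rng = np.random.default_rng(0)

def sis_fraction(n, beta=2.0, gamma=1.0, x0=0.1, t_max=5.0):
    """Gillespie simulation of an SIS epidemic among n individuals.
    The infected fraction is a Markov jump process whose generator converges,
    as n grows, to that of the ODE dx/dt = beta*x*(1-x) - gamma*x."""
    m, t = int(x0 * n), 0.0
    while t < t_max:
        up = beta * m * (n - m) / n    # infection rate
        down = gamma * m               # recovery rate
        total = up + down
        if total == 0:                 # absorbed at m = 0
            break
        t += rng.exponential(1.0 / total)
        m += 1 if rng.random() * total < up else -1
    return m / n

# The ODE's stable fixed point is x* = 1 - gamma/beta = 0.5; for large n
# the process concentrates near the deterministic trajectory.
x_final = sis_fraction(10_000)
print(abs(x_final - 0.5) < 0.05)  # True: within noise of the ODE prediction
```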
Large deviations theory, as I've said elsewhere, tries to make laws of large numbers quantitative. The laws say that fluctuations around the deterministic limit decay to zero; large deviations theory gives an asymptotic bound on these fluctuations. Roughly speaking, a sequence of random variables or processes $X^{(n)}$ obeys the large deviations principle when $n^{-1} \log{\Prob{X^{(n)} \in B}} \rightarrow -\inf_{x \in B}{I(x)}$ for some well-behaved "rate function" $I$. (Again, I gloss over some subtleties about the distinctions between open and closed sets.) The subject of this book is, depending on your point of view, either strengthening Kurtz's previous work on convergence of Markov processes to large deviations, or extending the large deviations theory of stochastic processes, as the title says.
In dealing with large deviations, it's very common to have to deal with cumulant generating functions. The reason is a basic approximation which goes back to Harald Cramér in the 1930s. Start with the Markov inequality: for a non-negative random variable $X$, $\Prob{X \geq a} \leq \Expect{X}/a$. Since $e^{tx}$ is an increasing function, when $t > 0$, and non-negative, $\Prob{X\geq a} = \Prob{e^{tX} \geq e^{ta}} \leq e^{-ta} \Expect{e^{tX}} ~ .$ Now take the log: $\log{\Prob{X\geq a}} \leq -ta + \log{\Expect{e^{tX}}}$ Since this is true for all $t$, $\begin{eqnarray*} \log{\Prob{X\geq a}} & \leq & \inf_{t}{-ta + \log{\Expect{e^{tX}}}}\\ & =& -\sup_{t}{ta - \log{\Expect{e^{tX}}}} \end{eqnarray*}$ $\log{\Expect{e^{tX}}}$ is precisely the cumulant generating function of $X$, and $\sup_{t}{ta - \log{\Expect{e^{tX}}}}$ is its Legendre transform.
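The bound is easy to check numerically. A sketch (my illustration, not the book's) for a standard Gaussian, whose cumulant generating function is $t^2/2$:

```python
import numpy as np

rng = np.random.default_rng(42)
a = 2.0

# For X ~ N(0,1), log E[e^{tX}] = t^2/2, so the Legendre transform is
# sup_t {ta - t^2/2} = a^2/2, and the bound reads P(X >= a) <= exp(-a^2/2).
chernoff_bound = np.exp(-a ** 2 / 2)          # ~ 0.135

x = rng.standard_normal(1_000_000)
empirical = (x >= a).mean()                   # true tail ~ 0.0228

print(empirical <= chernoff_bound)  # True: the bound holds, if loosely
```

The slack here (0.135 versus 0.023) is typical: for a single variable the Chernoff bound is crude, and it only becomes sharp on the exponential scale when applied to sums, as in the next calculation.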

Now imagine a sequence of variables $X_1, X_2, \ldots$, where $X_n = n^{-1}\sum_{i=1}^{n}{Y_i}$, with the $Y_i$ being (for simplicity) IID. Then we have a very parallel calculation which gives an exponentially shrinking probability: $\begin{eqnarray*} \Prob{X_n \geq a} & = & \Prob{\sum_{i=1}^{n}{Y_i} \geq na}\\ \Prob{X_n \geq a} & \leq & e^{-nta}{\left(\Expect{e^{tY_1}}\right)}^n\\ n^{-1}\log{\Prob{X_n \geq a}} &\leq & -\sup_{t}{ta - \log{\Expect{e^{tY_1}}}} \end{eqnarray*}$ Of course, there is still the matter of getting the matching lower bound, which I won't go into here, but is attainable.
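For IID standard Gaussians the sample mean is exactly $N(0, 1/n)$, so the tail probability has a closed form, and one can watch the normalized log-probability converge to minus the Legendre transform (again my illustration, not the book's):

```python
import math

# P(mean of n IID N(0,1)'s >= a) = 0.5 * erfc(a * sqrt(n/2)); the theory says
# n^{-1} log P should approach -sup_t {ta - t^2/2} = -a^2/2.
a = 1.0
for n in [10, 100, 1000]:
    p = 0.5 * math.erfc(a * math.sqrt(n / 2.0))
    print(n, math.log(p) / n)
# the printed values climb toward -a^2/2 = -0.5
```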

More generally, beyond just looking at the mean, one defines $\Lambda(f) = \lim_{n\rightarrow\infty}{n^{-1} \log{\Expect{e^{nf(X_n)}}}}$, and the large deviations rate function $I(x)$ is generally $\sup_{f}{f(x) - \Lambda(f)}$, taking the supremum over all bounded, continuous functions.
Accordingly, for each process $X_n$ we define operators which give us conditional cumulant functions: $V_n(t)f(x) = n^{-1} \log{\Expect{e^{nf(X_n(t))} \mid X_n(0)=x}}$ For fixed $n$ and $t$, $V_n(t)$ is a nonlinear operator (that is, nonlinear in $f$), but they still form a semi-group, $V_n(t+s) = V_n(t) V_n(s)$. $\begin{eqnarray*} V_n(t) V_n(s) f(x) &= & n^{-1}\log{\Expect{e^{n n^{-1}\log{\Expect{e^{nf(X_n(s))}\mid X_n(0)=X_n(t)}}}\mid X_n(0)=x}}\\ & = & n^{-1}\log{\Expect{\Expect{e^{nf(X_n(s))}\mid X_n(0)=X_n(t)}\mid X_n(0)=x}}\\ & = & n^{-1}\log{\Expect{e^{nf(X_n(s+t))} \mid X_n(0)=x}} = V_n(s+t) f(x) \end{eqnarray*}$
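The semi-group law for these nonlinear operators can likewise be checked numerically on a finite chain, where $V_n(t)f = n^{-1}\log(T_t e^{nf})$ entrywise (a toy sketch; the numbers are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

# Conditional cumulant operator for a finite-state chain with generator A:
# V(t) f = n^{-1} log( T_t e^{n f} ), computed entrywise over the states.
A = np.array([[-1.0,  1.0],
              [ 0.5, -0.5]])
n = 7.0
f = np.array([0.3, -0.2])

def V(t, f):
    return np.log(expm(t * A) @ np.exp(n * f)) / n

# Nonlinear in f, but still a semi-group: V(t+s) = V(t) V(s).
print(np.allclose(V(0.4, V(0.6, f)), V(1.0, f)))  # True
```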
When we have a sequence of processes, they give rise to a sequence of $V_n$ semi-groups. When that sequence of semi-groups converges to some limiting semi-group $V(t)$, the sequence of processes obeys a large deviations principle, and the rate function follows a somewhat complicated expression involving the limiting operators $V(t)$. Since we're talking about a probability measure over stochastic processes, the basic "points" are now trajectories, functions of time $x(t)$, and the rate function is a rather complicated expression that basically involves taking a path integral of the conditional cumulant generating functions and so $V(t)$. Matters become drastically simplified if we introduce what is, at least formally, the generator of the $V_n$ semi-group, $\frac{d}{dt} V_n(t) f = H_n V_n(t) f$ or more explicitly $H_n f = n^{-1} e^{-nf} A_n e^{nf}$
Convergence of the $H_n$ operators to a limiting operator $H$ is now the key part of showing large deviations for the processes $X_n$, and the rate function can be written in terms of the limiting operator $H$.
$H f(x)$ can typically be written as a function of $x$, $f(x)$ and $\nabla f(x)$ (or perhaps also higher derivatives of $f$). Similarly, the generator of the transitions, $A f(x)$, can often be written in terms of $x$, $f(x)$ and the derivatives of $f$. Bundle everything other than $x$ up into a variable $u$; then we can often write $H f(x) = \sup_{u}{\left\{ Af(x,u) - L(x,u) \right\}}$ for some operator $L$. The reason for going through all these gyrations is that then the large-deviations rate function takes a wonderfully simple form: $I(x) = \inf_{u \in \mathcal{J}(x)}\int_{t=0}^{t=\infty}{L(x(t),u(t)) dt}$ where $\mathcal{J}(x)$ consists of all functions $u(t)$ such that, for all reasonable $f$, $f(x(t)) - f(x(0)) = \int_{s=0}^{t}{Af(x(s), u(s)) ds}$
One can think of the limiting generator $A$ as saying what the derivative of the trajectory ought, in some sense, to be, and $L$ measuring how big a perturbation or disturbance has to be applied to drive the actual trajectory away from this easiest and most probable path. The rate function $I(x)$ then indicates the least amount of control that has to be applied to drive the process along the desired trajectory $x(t)$. ("Improbable events happen in the most probable possible way.") In fact, one can often simplify down to $I(x) = \int_{t=0}^{t=\infty}{L(x(t), \dot{x}(t)) dt}$
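For orientation, here is how this machinery plays out in the simplest small-noise-diffusion setting; this is my back-of-the-envelope sketch in one dimension, not the book's notation:

```latex
% A diffusion with drift b and small noise, taking n = 1/\epsilon:
\[
  dX_\epsilon = b(X_\epsilon)\,dt + \sqrt{\epsilon}\,dW, \qquad
  A_n f = b f' + \tfrac{\epsilon}{2} f'' .
\]
% Plugging into H_n f = n^{-1} e^{-nf} A_n e^{nf} and letting \epsilon \to 0:
\[
  H_n f = b f' + \tfrac{1}{2}{(f')}^2 + \tfrac{\epsilon}{2} f''
  \;\longrightarrow\; H f = b f' + \tfrac{1}{2}{(f')}^2 .
\]
% Writing p = f'(x), H(x,p) = b(x) p + p^2/2 is a Hamiltonian whose
% Legendre transform gives the Lagrangian:
\[
  L(x,u) = \sup_{p}{\left\{ up - H(x,p) \right\}} = \tfrac{1}{2}{(u - b(x))}^2 ,
\]
% so the rate function is the classic quadratic action:
\[
  I(x) = \tfrac{1}{2}\int{{\left( \dot{x}(t) - b(x(t)) \right)}^2 dt} .
\]
```

The cost of a deviation is the time-integrated squared mismatch between the actual velocity and the drift, which is exactly the Freidlin-Wentzell action mentioned below.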
Much of what I've just sketched was already established in the literature. What's new in this book are two big things: a powerful set of methods for verifying that the conditional cumulant operators $H_n$ converge on a limiting $H$, and another set of tools for writing $H$ in variational form. The former is the machinery of "viscosity solutions", which are weaker than the ordinary notion of solutions of differential equations. The latter brings in the machinery of control theory, or indeed of classical mechanics — as the notation suggests, $H$ is (like) a Hamiltonian, and $L$ (like) the corresponding Lagrangian. (The large deviations principle is thus a principle of least action, or perhaps vice versa, as suggested some time ago by, e.g., Eyink.)
The power of the Feng-Kurtz approach is considerable. Basically the whole of the classic Freidlin-Wentzell theory of ODEs perturbed by small-amplitude Gaussian noise falls out as an easy corollary. Moreover, the Feng-Kurtz theory extends to much more difficult cases, outside the reach of the Freidlin-Wentzell approach, such as infinite-dimensional interacting particle systems. While it is not for the mathematically inadept, it's extremely rewarding, and, I might add, probably about as well-written as such a book could possibly be. Chapter 1 describes the main lines of the approach, gives an extensive catalog of examples with explicitly heuristic reasoning, and then describes the difficulties in the way of turning the heuristics into actual theorems. Chapter 2 sketches the main definitions and results that will be established in the rest of the book. Chapters 3--4 review general concepts of process-level large deviations, specialized in chapter 5 to Markov processes and convergence of conditional cumulants ("classical semigroup approach"). Chapters 6 and 7 introduce viscosity solutions and their associated machinery. Chapter 8, in many ways the heart of the book, relates the limiting conditional cumulant semi-groups to control problems, and derives variational formulae for rate functions. The remaining chapters are devoted to giving rigorous versions of the examples from chapter 1.
Obviously, from the space I've devoted to this, I think this work is incredibly cool. I recommend it to anyone with the necessary math, and a serious interest in large deviations, in Markov processes, or the emergence of macroscopic regularities from microscopic interactions.
Charles Stross, Halting State
Mind candy science fiction, at the intersection of near-future police procedural, for-profit central banking, and online role-playing games. It gained quite a bit as an audiobook through being read by a Scots actor, one Robert Ian MacKenzie (though I had to re-listen to some bits to figure out what was being said). — Sequel.
Stephen José Hanson and Martin Bunzl (eds.), Foundational Issues in Human Brain Mapping
Like all paper collections, mixed. There is actually a certain amount of dialogue here, which is unusual and speaks highly of the organizers, and the papers are mostly (*) of high quality. The first three chapters are an exchange about the use of "functional localizers" (Friston et al. con, Saxe et al. pro); chapters 6--8 another on the related topic of "non-independence" (Vul and Kanwisher are against it, others try to salvage something); many of them (e.g., Poldrack's ch. 13) deal with the troublesome idea of "pure insertion", that it makes sense to just add one psychological process in to a task or setting, leaving the others undisturbed; many of them (among others, ch. 13 again, and chs. 17--19) revolve around the question of how much of psychology we need to get right, or can get right, before fMRI results become anything more than blinkenlights. Of the non-controversial chapters, I would highlight Haxby on multivariate methods, Poline et al. on inter-subject variability, and Hanson and Glymour on causal inference (though something has gone horribly wrong with the figures and captions in that last chapter).
Disclaimer: While I don't think I've met either editor, I've met at least two of the contributors at conferences, corresponded with and met another, and owe considerable intellectual and professional debts to a fourth. Writing a really negative review of this collection might've been awkward for me, but it would've been easy to say nothing at all.
*: I can't help pointing out the conspicuous exception of chapter 12, which is so transparently a recycled grant proposal that it still refers to itself as "this proposal" [e.g., pp. 136 and 142].
Jeffrey Robinson, BitCon: The Naked Truth about BitCoin
I picked this up on the recommendation of Izabella Kaminska. It's decent, if very pedestrian, when it comes to basic facts and figures, including the trivial scope of bitcoin and its deep unsuitability for many tasks. It is also a shrewd point that most (all?) of the mainstream companies which "accept bitcoin" really funnel on-line customers to services which convert bitcoin to dollars — so they really accept dollars, and are not (as the jargon goes) exposed to the risk of changes in the dollar price of bitcoins.
But I found the book quite dis-satisfying in at least three ways. First, the writing: Robinson rambles, and is never very good at either telling a consecutive story or assembling a cumulative argument. Second, it lacks technical depth; I get the impression that Robinson frankly does not understand the bitcoin protocol. He doesn't even really try to explain it, which of course makes it impossible for him to convey why it's technically impressive, why it might have uses other than for a weird medium of exchange, what makes a "50%+1" attack on the protocol possible, or even what miners actually do. (There are proofs by example that explaining such things is quite possible.) He also doesn't explain any of the usual economics of money, like how fractional reserve banking works and why deflation is dangerous (topics old as the hills), the network economies of scale of money (a bit younger than the hills), or why states always play such a central role in currencies, relevant though those topics are. (I'm tempted to say that Halting State, which is after all just a novel, is actually clearer on the relevant economics, but that may be unfair.) Third, he really does not write to convince; it's a mixture of facts and figures, rhetorical incredulity, and appeals to authority. His dis-inclination to concede anything to "the Faithful" is such that even when they have a sensible point to make (as, e.g., that the valuation of gold is overwhelmingly much more due to a shared, self-perpetuating social convention than the actual demand for jewelry, electronics, dental fillings and perpetual coffee filters), he refuses to concede it or even to present the counter-arguments. In consequence, I very much doubt he would persuade any reasonably skeptical teenager. (It will also not be as effective at stirring such a teenager's feelings of contempt as getting them to read Shit r/Bitcoin/ says.)
All of which is a shame, because bitcoin has been, at best, an idiotic waste of time and energy, and a good debunking book would be a valuable thing, which would have much to say about our time.
D. J. Finney, Experimental Design and Its Statistical Basis
Classic Fisherian experimental design theory, adapted to the meanest biological understanding of 1952. But it's very clear (I'm not ashamed to say I don't think I fully grasped fractional factorial designs before reading this), and most of the advice remains relevant, though there are of course now-quaint passages (*).
*: "For most experiments, the labor of statistical analysis is small relative to the total cost.... [T]he choice of a design should be scarcely influenced by consideration of whether or not the results will be easy to analyze. The symmetry of a well-designed experiment usually insures that the analysis is not excessively laborious... The attitude... of an experimenter who must do his own computing and who has no calculating machine will naturally differ from that of one in an organization with a well-equipped computing section. In any field of biology in which extensive numerical records are obtained, a calculating machine is an investment whose small cost is rapidly repaid by the saving of time, the increase in accuracy, and the practicability of computations previously thought prohibitively laborious that its use makes possible. A machine should be regarded as an indispensable adjunct to quantitative biological research, and an operator especially skilled in its use is an obvious economy if the volume of computing is large. This point is quite distinct from that of employing a statistician, and much research would benefit from the realization that a more systematic approach to its computations need not await the appointment of a statistician. Nevertheless, any biologist who has read this far will realize that he also needs access to the advice of a statistical specialist if he is to make the best use of modern ideas in experimental design." (pp. 160--161) It's worth remarking that, first, I think all of the computation done in the book involves fewer operations than what my phone does to play a medium-length song, and second, it'd still be good advice to make sure someone in your lab knows a little coding, even if you can't hire a statistician.
Harry Connolly, The Way into Chaos, The Way into Magic and The Way into Darkness
Mind candy fantasy epic, but of high quality. A basically bronze-age city has parlayed its control of magic, bestowed by visitors from another dimension in exchange for art, into a continental empire, complete with things like clean water, medicine that works, and steel. Then there's an outbreak from the dungeon dimensions, effectively decapitating the empire, and naturally the subordinate kings decide it's more important to fight over the scraps than deal with the outbreakees. Then things get really bad. It's told engagingly and at a great pace, but also with thought to the consequences of the speculative premises, and an almost Cherryhan sense of politics.
Queries involving spoilers (ROT-13'd): Nz V penml, be qbrf gur ynfg cneg bs gur ynfg obbx onfvpnyyl fnl gur Terng Jnl vf Lbt-Fbgubgu? Naq qba'g bhe urebrf zngpu hc jvgu gur gjb uhzna tbqf juvpu gur Terng Jnl qbrfa'g, ng yrnfg ng svefg, erpbtavmr?
Gordon S. Wood, The American Revolution: A History
A brief history, emphasizing changes in ideology and attitudes, from monarchy through republicanism to democracy. The whole of the revolutionary war is confined to one chapter, which was personally annoying since that was what I most wanted to learn more about, though I guess defensible. I also found it a bit frustrating that Wood doesn't consider what seem to me obvious comparative questions, and just takes certain institutional facts for granted. The comparative question: why did the 13 colonies which became the United States follow such a different course than did the British colonies which became Canada? On the institutional front: Wood claims that agitation against the British government's new taxes, and its efforts to govern directly more generally, was organized through local associations of people in the same trade, or for local improvements, or volunteer public services like fire-fighting. Where did those groups come from? Had they a history of being involved in political questions? If not, what tipped them over into opposition to these measures? Were there no volunteer fire departments in Newfoundland? Etc.
Despite my quibbling, I learned a great deal from this, which means I'm in no position to evaluate its accuracy or scholarly adequacy.
John Julius Norwich, Absolute Monarchs: A History of the Papacy
A very readable history of a rather old-fashioned kings-and-battles kind, focused on the persons of the Popes and their scheming and fighting with other European potentates. The papacy, as an institution which has changed over time, gets much less attention. (*) Admittedly, that would have made for a much longer, and less popular, book. It over-laps less with The Bad Popes than my prejudices would have suggested.
*: E.g., what did the popes actually do, day to day, in the early middle ages, or the high Renaissance, or the 19th century? How was their time divided between ruling Rome (or Avignon), administering the Church hierarchy (which meant doing what?), religious offices, and nepotism and intrigue? Just who were those around them who enabled them to do these things, who made the papacy an effective, or ineffective, power in the world? Etc.
Thanks to John S. for giving me access to Wood and Norwich.

Posted at January 31, 2015 23:59 | permanent link