## December 31, 2015

### Books to Read While the Algae Grow in Your Fur, December 2015

Attention conservation notice: I have no taste. Also, this month when I wasn't reading textbooks on regression, I was doped to the gills on a mixture of binged TV shows, serial audio fiction and flu medicine.

Michael H. Kutner, Chris J. Nachtsheim and John Neter, Applied Linear Regression Models
J. J. Faraway, Linear Models with R
Sanford Weisberg, Applied Linear Regression
Having taught undergraduate linear regression for the first time this year, I had to pick a textbook, which meant reading a lot of them. These were the three I made it through cover to cover. Kutner et al. (henceforth KNW) is the one previously assigned for the class, and which we ended up keeping for reasons of continuity. Faraway's and Weisberg's were optional.
I have to say that over the course of the semester I came to really dislike KNW. The mathematical level is very low: I don't think anyone could read it and come away with any notion of why the numerator and the denominator in an $F$ test statistic are independent, or even that an $F$ test is a specialization of a likelihood ratio test. Which, OK, there's room for regression textbooks which aren't deeply into probability. But it has a most unhelpful devotion to things which were never more than kludges adapted to the computing hardware of 1950 or even 1920, like ANOVA tables, and endless attention to transformations which try to make things look more Gaussian and/or additive, never mind what they do to the interpretations. Against this, there are literally four pages on the bootstrap, and while leave-one-out cross-validation is mentioned, multi-fold CV isn't. It's almost as though the last forty years of statistics never happened. This in turn makes the explanation of regression-tree fitting (another four pages, including examples) totally obscure. Now, I am sure that KNW know all this stuff perfectly well, but they don't teach it, at least not here, and I can't begin to fathom why. Even the data examples are small and antiquated and often just weak. (Who tries to predict house prices without information on location? Seriously, who?)
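For concreteness, here is a minimal sketch of multi-fold cross-validation for a least-squares fit, with leave-one-out as the special case where the number of folds equals the sample size. Nothing here comes from any of the three books: the function name `kfold_cv_mse` is hypothetical, the data are simulated, and the code assumes only NumPy.

```python
import numpy as np

def kfold_cv_mse(X, y, k, seed=0):
    """Estimate the out-of-sample mean squared error of OLS by k-fold CV."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)  # k roughly equal folds
    A = np.column_stack([np.ones(n), X])           # add an intercept column
    sq_err = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)   # everything not held out
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        sq_err[test] = (y[test] - A[test] @ beta) ** 2
    return sq_err.mean()

# Simulated data (made up for illustration): y is linear in x plus noise.
rng = np.random.default_rng(42)
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=100)

cv5 = kfold_cv_mse(x, y, k=5)
loo = kfold_cv_mse(x, y, k=len(y))  # k = n is leave-one-out
print(f"5-fold CV MSE: {cv5:.3f}; leave-one-out MSE: {loo:.3f}")
```

Each point is predicted exactly once, by a model that never saw it, so the average squared error estimates how well the fit generalizes rather than how well it memorizes.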
Both Faraway and Weisberg are superior in several ways: neither goes far into probability, but they move much faster, they are more up to date about things like Gaussianity, ANOVA tables, non-linear models, etc. (*), their computing is better, and their examples are more serious. Faraway has more material on shrinkage estimators (ridge regression, lasso) than Weisberg, and several chapters on experimental design, which Weisberg hardly touches on. On the other hand, Weisberg does have a more gentle opening with material on scatterplots and on "simple" regression (i.e., with one predictor variable). At least with undergrads, starting soft like that is probably a good idea.
None of the three books has an adequate discussion of causal inference, though again Faraway has the most; at least none of them say anything actively harmful on the matter. All three put model diagnostics after parametric inference within the model, which I realize is the traditional order but makes little sense — why bother testing whether such-and-such a slope is exactly zero if the model is rubbish in the first place? (**)
All three are outrageously priced, with KNW being by far the worst. (When the Revolution comes, Big Textbook won't be the first up against the wall, but they'll get a low number***.)
Clearly, I do not recommend KNW for self-study, though either Faraway or Weisberg should be fine. I would need a truly compelling reason to assign KNW again in the future. I would be happy to use either Faraway or Weisberg, leaning towards the former.
*: E.g., Weisberg (sec. 9.3, p. 204): "The assumption of normal errors plays only a minor role in regression analysis. It is needed primarily for inference with small samples, and even then the bootstrap ... can be used for inference. Furthermore, nonnormality of the unobservable errors is very difficult to diagnose in small samples by examination of residuals." Or Faraway (sec. 3.2, p. 35): "It is not really necessary to specifically compute all the elements of the [ANOVA] table. As the originator of the table, Fisher said in 1931, it is 'nothing but a convenient way of arranging the arithmetic.' Since he had to do his calculations by hand, the table served a necessary purpose, but is not essential now."
**: Yes, there are circumstances where one might be interested in testing hypotheses about the best linear approximation, using a fixed set of variables, to the true regression. But then you couldn't use test procedures which assume there's no approximation error! (Cf.)
***: Speaking of Big Textbook, my remarks are all about the 3rd edition of Weisberg, not the 4th edition, which I haven't seen. Some of the "new features" advertised by the publisher for the update, like covering the bootstrap, are actually in the 3rd edition.
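To illustrate the point in the Weisberg quotation about using the bootstrap for inference when the errors aren't Gaussian, here is a rough sketch of one common variant: resampling cases and taking percentile limits for the slope. This is not necessarily the procedure any of these books describes; the function name `bootstrap_slope_ci` is hypothetical, the data are simulated with deliberately skewed noise, and the code assumes only NumPy.

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval for an OLS slope, resampling (x, y) pairs."""
    rng = np.random.default_rng(seed)
    n = len(y)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # draw n cases with replacement
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]  # refit the line; keep the slope
    alpha = 1.0 - level
    lower, upper = np.quantile(slopes, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Simulated data (made up for illustration): skewed, mean-zero errors.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
noise = rng.exponential(scale=1.0, size=40) - 1.0
y = 3.0 + 0.5 * x + noise

lower, upper = bootstrap_slope_ci(x, y)
print(f"95% bootstrap interval for the slope: ({lower:.3f}, {upper:.3f})")
```

Resampling whole cases makes no use of the Gaussian-noise assumption, which is the reason for picking that variant in this sketch.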
Disclaimer: I am, supposedly, finishing my own textbook on statistics. But that book very deliberately presupposes a reader who has already gone through a course in linear regression, so it's not directly in competition with any of these.
J. Richard Büchi, Finite Automata, Their Algebras and Grammars: Towards a Theory of Formal Expressions (ed. Dirk Siefkes)
A presentation of automata theory, especially the theory of finite automata, as a branch of abstract algebra. This was a manuscript left incomplete at the time of Büchi's death, edited, but not completed, by Siefkes (e.g., there are references to never-written sections). While the writing is a bit self-indulgently opinionated (*), the interplay between algebraic and automata-theoretic ideas is very good. Though it's probably not a useful introduction to either abstract algebra or abstract automata, this would have been really good for me to have read in graduate school, and might still be helpful for some on-going projects.
*: File that under "takes one to know one".
Max Gladstone, Margaret Dunlap, Mur Lafferty and Brian Francis Slattery, Bookburners
Mind candy contemporary fantasy, in serial form, i.e., weekly installments of about 20--30 pages each. It's enjoyable, and also an interesting experiment with replicating in prose the TV-show structure of mixing forbidden-tome-of-the-week episodes with ones that advance a larger-scale plot. I liked the writing enough to track down other books by all four authors.
The Black Tapes
Imagine an NPR affiliate deciding to run a series about a skeptical paranormal investigator's unsolved cases. Now make the stories much creepier than whatever you were imagining, and let the stories start to overlap disquietingly as the season progresses.... (Also, make the voice of the reporter something I can actually stand to listen to, unlike the usual NPR voices, which I find only slightly more pleasant than the sound of nails on a chalkboard.) The first season, all there currently is, ends on a cliff-hanger, but more is promised in January.
Welcome to Night Vale
Mentioning The Black Tapes reminds me that I have been meaning to plug this podcast for years. (I believe I first found it through Kate Nepveu.) Each episode is, supposedly, about 25 minutes of community radio from the small high-desert American town of Night Vale. Night Vale is a town full of high school football stars, vague yet menacing government agencies, hipster record stores, hooded figures, dog parks, sheriff's secret policemen, school boards dominated by sentient glowing clouds (ALL HAIL), librarians, smiling gods, teenage girls with an intense devotion to literature, opera houses, miniature civilizations found beneath bowling lanes, condos, unsupported old oak doors appearing out of nowhere, etc., etc., etc. Some of the jokes build for months before coming to the pay-off. It's very much my sort of thing (I am not sure if the friend for whom I bought the "if you see something, say nothing and drink to forget" flask has quite forgiven me), even though I find the musical interludes (a.k.a. "the weather") almost uniformly forgettable.
Samuel Bowles, The New Economics of Inequality and Redistribution (in collaboration with Christina Fong, Herbert Gintis, Arjun Jayadev, and Ugo Pagano)
A short little collection of lectures (180 pages including preface and math-y appendices), drawing on Bowles's papers from the 2000s. (Hence the long list of collaborators.) The Big Idea here is that egalitarianism has got a lot more room for maneuver than the current conventional wisdom and economics (to the extent those are different) holds. "If I had to do a bumper sticker for the new economics of inequality it would be: INEQUALITY: IT DOESN'T WORK AND PEOPLE DON'T LIKE IT" (p. xiii).
The "people don't like it" part is the work of Bowles and collaborators on strong reciprocity, which I've gone on about at excessive length for many years.
My remarks on the "it doesn't work" part have grown excessive, so I'll try to spin that off into a post on its own.
ObDisclaimer: Sam is an acquaintance of long standing, both of us being affiliated with Santa Fe.
Person of Interest
Murder-of-the-week mind candy, with a flavoring of pre-crime. That the makers of the Machine had, apparently, never encountered the concept "false positive" is only too realistic, given what we know of how the national surveillance state thinks, but a few of them would have improved the show considerably.
The Librarians
Mind candy TV, adorable contemporary fantasy division.
(Somebody must have written a good essay on the way American pop culture tries to assimilate any sort of voluntary community to a nuclear family: who?)
Kathleen George, Simple
Mind candy mystery, continuing her series set in Pittsburgh; this time it's a politically motivated murder. (I refuse to call that a spoiler.) The characterization remains very good, and the jail scenes are convincing. The you-are-there details all concern neighborhoods I know very well, and are absolutely on target.
Lila Bowen, Wake of Vultures
Mind candy, young-adult western fantasy division. Enjoyable enough that I will keep an eye out for the inevitable sequel. (It's also a sign of the times that the protagonist of what is, in fact, a classically-formed western for young adults can be a part-Indian, part-black, bi-sexual girl raised in near-slavery who wants to be a boy.)