January 31, 2016

Books to Read While the Algae Grow in Your Fur, January 2016

Attention conservation notice: I have no taste, and no qualifications to opine on the histories of the First World War, of Antarctic exploration and science, or of political economy. (I actually do have some qualifications to opine on statistical theory and collective cognition, but who wants to read about that?)

Mark Thompson, The White War: Life and Death on the Italian Front, 1915--1919
A well-told narrative history of the war, mostly from the Italian side. He covers all aspects, from the back-and-forth of the twelve (!) battles of the Isonzo and diplomatic machinations to war literature and the cults of vitalism and "mystical sadism". One of my great-grandfathers was an engineer in the Italian army during this, and a vague tradition of a grossly incompetent, futile conflict had come down to me, but before reading this I had no idea of just how bad it was. Or just how much the Italian state's conduct of the war helped set the stage for Fascism.
Tremontaine
Mind candy: a "fantasy of manners", combining spherical trigonometry, the chocolate trade, aristocratic intrigue, and the authors toying with the characters' affections. It's a prequel, of sorts, to Ellen Kushner's Swordspoint, which I read long enough ago that I remember only a vague atmosphere.
Apsley Cherry-Garrard, The Worst Journey in the World
A deservedly-classic memoir of the British Antarctic expedition of 1910--1913. The writing is vivid, the conditions described are alternately wondrous and appalling (admittedly, much more appalling than wondrous), and the feats of physical endurance and stoicism remarkable. What's even more astonishing, now, is the sheer futility of it all. "We were primarily a great scientific expedition, with the Pole as our bait for public support, though it was not more important than any other acre of the plateau": but that publicity stunt killed five people, and thoroughly set the agenda for all the rest of the expedition. Or take Cherry-Garrard's titular "worst journey in the world", over a month on foot through the darkness of an Antarctic winter, at temperatures up to a hundred degrees Fahrenheit below freezing; by rights it should have killed the three people who attempted it, and nearly did many times over. It had more of a scientific purpose, namely to collect embryos of the Emperor penguin, but that goal was itself based on a thoroughly bad theory, that the penguins are the most "primitive" of birds, and "If penguins are primitive, it is rational to infer that the most primitive penguin is farthest south". When we talk about science advancing funeral by funeral, this is not what we have in mind.
Near the end of the book, Cherry-Garrard makes a rousing call for creating, and funding, a proper scientific presence in Antarctica; whether this helped lead to the modern British Antarctic Survey I don't know, but I'd like to think so.
There is an essay to be written about the anxieties about British masculinity and national degeneration on display in the opening and concluding chapters. There's another essay to be written about this as a source text for At the Mountains of Madness, for everything from the Antarctic crinoids to the fusion of doomed science and masculinity (*). Probably both of these essays have been written. They'd be worth writing because this is a great book.
ObLinkage: Maciej "Idle Words" Ceglowski, Scott and Scurvy
*: "Poor devils! After all, they were not evil things of their kind. They were the men of another age and another order of being. Nature had played a hellish jest on them .... [P]oor Old Ones! Scientists to the last — what had they done that we would not have done in their place? God, what intelligence and persistence! What a facing of the incredible, just as those carven kinsmen and forbears had faced things only a little less incredible! Radiates, vegetables, monstrosities, star-spawn — whatever they had been, they were men!" (At the Mountains of Madness, Chapter 11)
Warren Ellis and Gianluca Pagliarani, Ignition City
Warren Ellis, Declan Shalvey and Jordie Bellaire, Injection, vol. 1
Comic book mind candy. Ignition City is Ellis playing around with space opera of the old Flash Gordon / Buck Rogers mold; it's fun but not much more. Injection goes deeper, to some place where worries that the future is coming at us too fast and that we've somehow used up our ability as a culture to come up with anything new and just recycle old fads meets weird little bits of British folklore, and sets off explosions. Also, it's gorgeously drawn.
Veronika Meduna, Secrets of the Ice: Antarctica's Clues to Climate, the Universe, and the Limits of Life (a.k.a. Science on Ice: Discovering the Secrets of Antarctica)
Well-written and serious (but not solemn) popular book about scientific research in Antarctica, especially by scientists from or working in New Zealand, accompanied by tons of beautiful photos. My one complaint is that I kept wanting to know more.
A. J. Lee, $U$-Statistics: Theory and Practice
Suppose we want to estimate some attribute $\theta$ of a probability distribution $F$, and we have available samples $X_1, X_2, \ldots X_n$ drawn iidly from $F$. A fundamental theorem of Halmos's says that $\theta(F)$ has an unbiased estimator iff $\theta(F) = \mathbb{E}_{F}[\psi(X_1, X_2, \ldots X_k)]$ for some function $\psi$ of $k$ variables. A natural estimator would then be $\psi(X_1, X_2, \ldots X_k)$. But another unbiased estimator would be $\psi(X_{n-k+1}, X_{n-k+2}, \ldots X_n)$, and so forth; a natural impulse is to reduce the variance by averaging such estimates together. Furthermore, since the $X_i$ are IID, it shouldn't matter what order we take them in, so a good estimator should be symmetric in its arguments. (Said differently, the order statistics are always sufficient statistics for an IID sample.)
The $U$ statistic corresponding to a symmetric kernel function $\psi$ of order $k$ is \[ U_n \equiv {n \choose k}^{-1} \sum_{i \in (n,k)} {\psi(X_{i_1}, \ldots X_{i_k})} \] where $(n,k)$ runs over all ways of picking $k$ distinct indices from $1:n$. If the space of distributions we're working with is not too small, then $U_n$ is the unique estimator of the corresponding $\theta(F)$ that is both symmetric and unbiased. Moreover, $U_n$ has the minimum variance among all unbiased estimators. (If the original $\psi$ was not symmetric, we can always replace it with a symmetrized version which gives the same $\theta(F)$.) Thus the basic sort of $U$ statistic. Variants include not summing over all possible $k$-tuples ("incomplete" $U$-statistics), multi-sample $U$-statistics, dependent observations, etc.
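To make the definition concrete, here is a minimal sketch of a complete $U$ statistic computed by brute force over all $k$-subsets. The function names (`u_statistic`, `variance_kernel`) and the simulated Gaussian data are assumptions made for illustration, not anything taken from Lee's book. The kernel $\psi(x,y) = (x-y)^2/2$ has order 2 and expectation equal to the variance of $F$, so the resulting $U_n$ should coincide with the usual unbiased sample variance.

```python
from itertools import combinations
from math import comb
import random
import statistics


def u_statistic(kernel, sample, k):
    """Average a symmetric kernel of order k over all k-subsets of the sample.

    Brute force: the sum has C(n, k) terms, so this is only for small n.
    """
    n = len(sample)
    if n < k:
        raise ValueError("need at least k observations")
    total = sum(kernel(*subset) for subset in combinations(sample, k))
    return total / comb(n, k)


def variance_kernel(x, y):
    """Symmetric order-2 kernel whose expectation is Var(X)."""
    return 0.5 * (x - y) ** 2


# Hypothetical data: 30 draws from a standard Gaussian.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(30)]

u_var = u_statistic(variance_kernel, xs, 2)
s2 = statistics.variance(xs)  # unbiased sample variance, n - 1 denominator
print(u_var, s2)  # the two should agree up to floating-point error
```

With the order-1 kernel $\psi(x) = x$ the same function just returns the sample mean, which is the simplest $U$ statistic of all.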
Unbiasedness is not, in itself, a terribly interesting property; unbiased estimators might, for instance, fail to converge. What matters more is that lots of very natural parameters or functionals can be cast in this form. (I was led to pick this book up because, for a paper, I needed to know about how closely the actual number of edges between two kinds of node in a network would approximate its expectation.) The terms in the average in a $U$ statistic are dependent on each other, because they share arguments, e.g., $\psi(X_1, X_2)$ will be statistically dependent on $\psi(X_1, X_3)$ and $\psi(X_2, X_{14})$. But this dependence has a nice combinatorial structure, which lets us re-write the $U$ statistic as a sum of uncorrelated terms (the "$H$-decomposition" or "$H$-projection"). The 0th order term in this decomposition is just $\theta$; the first-order corrections are functions of the individual $X_i$ (and so IID); the 2nd order corrections are symmetric functions of pairs of $X_i$s, and so forth. Since the higher-order terms in this expansion are generally of smaller order than the earlier ones, this lets us give systematic approximation formulas for things like the variance of a $U$ statistic, and in general to port over much of the ordinary IID limit theory without too much trouble.
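As an illustration of the kind of formula this yields, here is the classical variance expression due to Hoeffding. The notation $\psi_c$ and $\sigma_c^2$ is an assumption introduced for this sketch, and may not match the book's. Let $\psi_c(x_1, \ldots x_c) = \mathbb{E}_{F}[\psi(x_1, \ldots x_c, X_{c+1}, \ldots X_k)]$ be the kernel with its last $k-c$ arguments averaged out, and let $\sigma_c^2$ be the variance of $\psi_c(X_1, \ldots X_c)$. Then \[ \mathrm{Var}[U_n] = {n \choose k}^{-1} \sum_{c=1}^{k} {k \choose c} {n-k \choose k-c} \sigma_c^2 \] and for fixed $k$ the $c=1$ term dominates as $n$ grows, giving \[ \mathrm{Var}[U_n] = \frac{k^2 \sigma_1^2}{n} + O(n^{-2}) \] so that, provided $\sigma_1^2 > 0$, $U_n$ behaves to first order like an average of $n$ IID terms.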
This book is a good tour of the state of the statistical theory as of 1990. The first chapter covers the most basic facts about $U$ statistics, rather as I've done above. The second chapter deals with variations (including, beyond those I've mentioned, independent but not identically distributed data, sampling from a finite population, and weighting terms in the sum). Chapter 3 covers asymptotics, emphasizing situations where IID methods and results carry over. Chapter 4 covers further generalizations, such as symmetric statistics which are not $U$ statistics. Chapter 5 is about getting standard errors using the jackknife and the bootstrap. Finally, chapter 6 covers applications beyond those already given as examples in earlier chapters, such as testing distributions for symmetry, and testing pairs of random variables for statistical independence. The writing is clear, the organization is logical, complicated or lengthy proofs get preliminary sketches, and the references are extensive.
Lee's book is a generation old; as such it looks mostly at the classical part of the theory, from its origins in the 1940s to when Lee was writing, which, like most statistical theory of the period, emphasized asymptotics. (All I needed were those asymptotics.) Since then, people in statistical learning have gotten very interested in $U$ statistics because of their relationship to ranking problems. This recent work, however, has emphasized non-asymptotic, finite-$n$ concentration results which simply weren't on Lee's horizon. I don't know that literature well enough to say whether there's a more comprehensive replacement for this book.
Karl Marx, Capital: A Critique of Political Economy, vol. I: The Process of Capitalist Production
I read this as a teenager, but the other day, one of the occasional used-book dealers who comes by campus had, in addition to the usual collection of novels from the 1970s and Dover books on mathematics, a stout little ex-library hardback of Capital, volume I. It was (perhaps appropriately) virtually free, and so I found myself moved to buy it, and then to re-read it. My reading notes have grown to over 7000 words, so I'll make them their own post when they're done.
I will say three things: (1) This could have been much shorter, and much clearer, with the benefit of ideas like "equivalence class" --- which Marx couldn't've known about. (2) The labor theory of value seems even less plausible to me now than it did as a teenager. (Back then, I suspected there were arguments for it which I was missing; now I see there aren't any.) (3) I hereby apologize to those of my humanities and social-studies teachers who I nonetheless trolled with labor-theory arguments.
--- The promised notes.
C. J. Lyons, Blood Stained; Kill Zone; Hard Fall; Fight Dirty
Sequels to Snake Skin; mind candy, or perhaps mind popcorn. I confess I might not have kept up so far were it not for the perverse pleasure of seeing Lyons (and her characters) wreak havoc on familiar Pittsburgh neighborhoods.
Spoiler-y nit-picking about Kill Zone and sequels: I am not going to complain about the Abominable Afghan Antagonist in Kill Zone; if anything, I regard our graduating from faceless henchmen to the ranks of active villains as a mark of progress. (Though I note this is the second novel in which I've encountered an Afghan immigrant to Pittsburgh, and the second in which they're a plotting master-mind --- is there some local news story I missed?) I also don't object to the fact that the body count in the third book is explicitly over 60 deaths in one night, and by my rough count could easily have been 120. This would put the death toll in range of the Oklahoma City bombing, and indeed last year the whole of Allegheny County had only 108 homicides. So this would be a whole year's worth of killing in one night, and, per the story, not a year's worth of, so to speak, ordinary personal and criminal killing, but a targeted attack on the institutions and personnel of government (like Oklahoma City). To imagine that this wouldn't have cataclysmic political consequences for the whole nation is absurd --- hell, Lyons's characters talk about how it will have such consequences. And then in the subsequent books, none of those consequences follow. I still read, and enjoyed, those books, but I am a bit offended by the shoddy world-building. (Cf. Timothy Burke on lack of consequences in comic books.)
Keith Sawyer, Group Genius: The Creative Power of Collaboration
For the most part, this is pretty good popular social science about the social psychology and sociology of group creativity and problem solving. It appears, however, to be pitched as a business-advice book, which leads to a certain amount of "we are told by Science! to do X", when what science actually says is a lot more ambiguous. (E.g., group creativity is almost certainly not maximized by a particular mean degree in social networks.) Also, it leads him to take the perspective of employers and corporations, as opposed to individual research workers (*). So I guess I'm making the usual reviewer's complaint of wishing that Sawyer had written a different, more academic book, as opposed to the one he wanted to write. But I learned some interesting things from it, and if I was a newcomer to this area I'd have learned quite a bit.
*: E.g., when companies use websites where they throw out problems to freelance researchers and pay for one successful solution, they are shifting the risk and uncertainty of the research process on to all the individuals who try to find solutions. This is obviously good for the corporations, but bad for the researchers. (It also offers the company more bargaining power against their in-house researchers, since it improves the company's disagreement payoff.)

Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Writing for Antiquity; Enigmas of Chance; The Dismal Science; The Collective Use and Evolution of Concepts; Pleasures of Detection, Portraits of Crime; Heard About Pittsburgh PA

Posted at January 31, 2016 23:59

Three-Toed Sloth