## May 31, 2015

### Books to Read While the Algae Grow in Your Fur, May 2015

Attention conservation notice: I have no taste.

Cixin Liu, The Three-Body Problem (translated by Ken Liu [no relation])
A really remarkably engrossing novel of first contact. (I will refer you to James Nicoll for plot summary.) As a novel of first contact, I think it bears comparison to some of the classics, like War of the Worlds and His Master's Voice: it realizes that aliens will be alien, and that however transformative contact might be, people will continue to be human, and to react in human ways.
— It has a lot more affinities with Wolf Totem than I would have guessed — both a recognizably similar mode of narration, and, oddly, some of the content — educated youths rusticated to Inner Mongolia during the Cultural Revolution, environmental degradation there, and nascent environmentalism. Three-Body Problem works these into something less immediately moving, but perhaps ultimately much grimmer, than Wolf Totem. I say "perhaps" because there are sequels, coming out in translation, which I very eagerly look forward to.
Elif Shafak, The Architect's Apprentice
Historical fiction, centered on the great Ottoman architect Sinan, but told from the viewpoint of one of his apprentices. I am sure that I missed a lot of subtleties, and I half-suspect that there are allusions to current Turkish concerns which are completely over my head. (E.g., the recurrence of squatters crowding into Istanbul from the country-side seems like it might mean something...) Nonetheless, I enjoyed it a lot as high-class mind candy, and will look for more from Shafak.
ROT-13'd for spoilers: Ohg jung ba Rnegu jnf hc jvgu gur fhqqra irre vagb snagnfl --- pbagntvbhf phefrf bs vzzbegnyvgl, ab yrff! --- ng gur raq?
Barry Eichengreen, Hall of Mirrors: The Great Depression, The Great Recession, and the Uses — and Misuses — of History [Author's book site]
What it says on the label: a parallel history of the Great Depression and the Great Recession, especially in the US, and of how historical memories (including historical memories recounted as economic theories) of the former shaped the response to the latter.
If anyone actually believed in conservatism, a conservative paraphrase of Eichengreen would run something like this: back in the day, when our forefathers (and foremothers) came face to face with the consequences of market economies run amok, they created, through a process of pragmatic trial and error, a set of institutions which allowed for an unprecedented period of stable and shared prosperity. Eventually, however, there arose an improvident generation (mine, and my parents') with no respect for the wisdom of its ancestors, enthralled by abstract theories, a priori ideologies, and Utopian social engineering, which systematically dismantled or subverted those institutions. In the fullness of time, they reaped what they had sown, namely a crisis, and a series of self-inflicted economic wounds, which had no precedent for fully eighty years. Enough of the ancestors' works remained intact that the results were merely awful, however, rather than the sort of utter disaster which could lead to substantial reform, or reconsideration of ideas. And here we are.
(Thanks to IB and ZMS for a copy of this.)
David Danks, Unifying the Mind: Cognitive Representations as Graphical Models
This book may have the most Carnegie Mellon-ish title ever.
Danks's program in this book is to argue that large chunks of cognitive psychology might be unified not by employing a common mental process, or kind of process, but because they use the same representations, which take the form of (mostly) directed acyclic graphical models, a.k.a. graphical causal models. In particular, he suggests that representations of this form (i) give a natural solution to the "frame problem" and other problems of determining relevance, (ii) could be shared across very different sorts of processes, and (iii) make many otherwise puzzling isolated results into natural consequences. The three domains he looks at in detail are causal cognition (*), concept formation and application, and decision-making, with hopes that this sort of representation might apply elsewhere. Danks does not attempt any very direct mapping of the relevant graphical models on to the aspects of neural activity we can currently record; this strikes me as wise, given how little we know about psychology today, and how crude our measurements of brain activity are.
Disclaimer: Danks is a faculty colleague at CMU, I know him slightly, and he has worked closely with several friends of mine (e.g.). It would have been rather awkward for me to write a very negative review of his book, but not awkward at all to have not reviewed it in the first place.
*: Interestingly to me, Danks takes it for granted that we (a) have immediate perceptions of causal relations, which (b) are highly fallible, and (c) in any case conform so poorly to the rules of proper causal models that we shouldn't try to account for them with graphical models. I wish the book had elaborated on this, or at least on (a) and (c).
F. Gregory Ashby, Statistical Analysis of fMRI Data
This is another textbook introduction, like Poldrack, Mumford and Nichols, so I'll describe it by contrast. Ashby gives very little space to actual data acquisition and pre-processing; he's mostly about what you do once you've got your data loaded into Matlab. (To be fair, this book apparently began as the text for one of two linked classes, and the other covered the earlier parts of the pipeline.) The implied reader is, evidently, a psychologist, who knows linear regression and ANOVA (and remembers there's some sort of link between them), and has a truly unholy obsession with testing whether particular coefficients are exactly zero. (I cannot recall a single confidence interval, or even a standard error, in the whole book.) Naturally enough, this makes voxel-wise linear models the main pillars of Ashby's intellectual structure. This also explains why he justifies removing artifacts, cleaning out systematic noise, etc., not as avoiding substantive errors, but as making one's results "more significant". (I suspect this is a sound reflection of the incentives facing his readers.) To be fair, he does give very detailed presentations of the multiple-testing problem, and even ventures into Fourier analysis to look at "coherence" (roughly, the correlation between two time series at particular frequencies), Granger causality, and principal and independent component analysis [1].
This implied reader is OK with algebra and some algebraic manipulations, but needs to have their hand held a lot. Which is fine. What is less fine are the definite errors which Ashby makes. Two particularly bugged me:
1. "The Sidak and Bonferroni corrections are useful only if the tests are all statistically independent" (p. 130): This is true of the Sidak correction but not of the Bonferroni, which allows arbitrary dependency between the tests. This mistake was not a passing glitch on that one page, but appears throughout the chapter on multiple testing, and I believe elsewhere.
2. Chapter 10 repeatedly asserts that PCA assumes a multivariate normal distribution for the data. (This shows up again in chapter 11, by way of a contrast with ICA.) This is quite wrong; PCA can be applied so long as covariances exist. The key proposition 10.1 on p. 248 is true as stated, but it would still be true if all instances of "multivariate normal" were struck out, and all instances of "independent" were replaced with "uncorrelated". This is related to the key, distribution-free result, not even hinted at by Ashby, that the first $k$ principal components give the $k$-dimensional linear space which comes closest on average to the data points. Further, if one does assume the data came from a multivariate normal distribution, then the principal components are estimates of the eigenvectors of the distribution's covariance matrix, and so one is doing statistical inference after all, contrary to the assertion that PCA involves no statistical inference. (More than you'd ever want to know about all this.) [2]
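The first of these points is easy to check numerically. The little simulation below (my illustration, not from the book) makes all $m$ null tests perfectly dependent by giving them a single shared p-value, which is about as far from independence as one can get; Bonferroni's union bound still controls the family-wise error rate, whereas Sidak's independence-based guarantee is simply inapplicable.

```python
import numpy as np

# Illustration (not from Ashby): Bonferroni under *perfect* dependence.
# All m null hypotheses share one uniform p-value, the most extreme
# violation of independence possible.
rng = np.random.default_rng(0)
m, alpha, n_sims = 20, 0.05, 200_000

p_shared = rng.uniform(size=n_sims)   # one null p-value per simulated study
# Bonferroni rejects any test with p < alpha/m; here a family-wise error
# occurs exactly when the shared p-value crosses that threshold.
fwer = np.mean(p_shared < alpha / m)
print(fwer)  # approximately alpha/m = 0.0025, comfortably below alpha = 0.05
```

The point is that the union bound, $P(\text{any of } m \text{ events}) \leq \sum$ of their probabilities, needs no independence whatsoever, which is why Bonferroni survives arbitrary dependence between the tests.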
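The second point can also be illustrated directly. In the sketch below (my example, not the book's), the data are built from exponential and uniform noise, about as non-Gaussian as one likes, yet the first principal component still gives the best-fitting line through the point cloud, because that optimality needs only second moments.

```python
import numpy as np

# Illustration: PCA on deliberately non-Gaussian data. The first principal
# component is the eigenvector of the sample covariance matrix with the
# largest eigenvalue; no normality assumption enters anywhere.
rng = np.random.default_rng(1)
n = 5000
direction = np.array([3.0, 1.0]) / np.sqrt(10.0)
X = rng.exponential(size=(n, 1)) * direction + rng.uniform(-0.5, 0.5, size=(n, 2))
Xc = X - X.mean(axis=0)

evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = evecs[:, -1]  # eigh sorts eigenvalues ascending, so last column is PC1

def mean_sq_residual(v):
    """Mean squared distance of the centered points from the line through v."""
    return np.mean(np.sum(Xc**2, axis=1) - (Xc @ v) ** 2)

# Distribution-free optimality: no other direction fits the cloud better.
grid = [np.array([np.cos(t), np.sin(t)]) for t in np.linspace(0, np.pi, 181)]
assert all(mean_sq_residual(pc1) <= mean_sq_residual(v) + 1e-9 for v in grid)
```

(Checking against a grid of directions is just a sanity check; the eigendecomposition guarantees the result exactly.)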
The discussion of Granger causality is more conceptually confused than mathematically wrong. It's perfectly possible, contra p. 228, that activity in region $i$ causes activity in region $j$ and vice versa, even with "a definition of causality that includes direction"; they just need to both do so with a delay. How this would show up given the slow measurement resolution of fMRI is a tricky question, which Ashby doesn't notice. There is an even deeper logical flaw: if $i$ and $j$ are both being driven by a third source, which we haven't included, then $i$ might well help predict ("Granger cause") $j$. In fact, even if we include this third source $k$, but we measure it imperfectly, $i$ could still help us predict $j$, just because two noisy measurements are better than one [3]. Indeed, if $i$ causes $j$ but only through $k$, and the first two variables are measured noisily, we may easily get non-zero values for the "conditional Granger causality", as in Ashby's Figure 9.4. Astonishingly, Ashby actually gets this for his second worked example (p. 242), but it doesn't lead him to reconsider what, if anything, Granger causality tells us about actual causality.
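The common-driver problem is easy to reproduce in a toy simulation (mine, not Ashby's, and nothing fMRI-specific about it): a hidden AR(1) process drives two noisy observables with no causal link between them, and each still helps predict the other.

```python
import numpy as np

# Illustration: spurious Granger causality from an unmeasured common driver.
# X_k is a hidden AR(1) process; X_i and X_j are independent noisy copies of
# it, with no causal arrow running between them.
rng = np.random.default_rng(2)
T = 20000
xk = np.zeros(T)
for t in range(1, T):
    xk[t] = 0.9 * xk[t - 1] + rng.standard_normal()
xi = xk + rng.standard_normal(T)
xj = xk + rng.standard_normal(T)

def residual_var(y, regressors):
    """Variance of OLS residuals of y on the given columns (plus intercept)."""
    Z = np.column_stack(regressors + [np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.var(y - Z @ beta)

# Does X_j's past help predict X_i beyond X_i's own past? It does, because
# X_j(t-1) is a second noisy reading of the hidden driver X_k(t-1).
own = residual_var(xi[1:], [xi[:-1]])
both = residual_var(xi[1:], [xi[:-1], xj[:-1]])
print(both < own)  # X_j "Granger-causes" X_i despite no causal link
```

Swapping the roles of $X_i$ and $X_j$ gives the same verdict in the other direction, so the "causality" even runs both ways.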
While I cannot wholeheartedly recommend a book with such flaws, Ashby has obviously tried really hard to explain the customary practices of his tribe to its youth, in the simplest and most accessible possible terms. If you are part of the target audience, it's probably worth consulting, albeit with caution.
[1] Like everyone else, Ashby introduces ICA with the cocktail-party problem, but then makes it about separating speakers rather than conversations: "Speech signals produced by different people should be independent of each other" (p. 258). To be fair, I think we've all been to parties where people talk past each other without listening to a thing anyone else says, but I hope they're not typical of Ashby's own experiences.
[2] Of course, Ashby introduces PCA with a made-up example of two test scores being correlated and wanting to know if they measure the same general ability. Of course, Ashby concludes the example by saying that we can tell both tests do tap into a common ability by their both being positively correlated with the first principal component. You can imagine my feelings.
[3] For the first case, say $X_i(t) = X_k(t) + \epsilon_i(t)$, $X_j(t) = X_k(t) + \epsilon_j(t)$, with the two noise terms $\epsilon_i, \epsilon_j$ independent, and $X_k(t)$ following some non-trivial dynamics, perhaps a moving average process. Then predicting $X_i(t+1)$ is essentially predicting $X_k(t+1)$ (and adding a little noise), and the history of $X_i$, $X_i(1:t)$, will generally contain strictly less information about $X_k(t+1)$ than will the combination of $X_i(1:t)$ and $X_j(1:t)$. For the second case, suppose we don't observe the $X$ variables, but $B=X+\eta$, with extra observational noise $\eta_t$ independent across $i$, $j$ and $k$. Then, again, conditioning on the history of $B_j$ will add information about $X_k(t+1)$, after conditioning on the history of $B_i$ and even the history of $B_k$.
Lauren Beukes, The Shining Girls
A time-traveling psycho killer (a literal murder hobo) and his haunted house versus talented and energetic ("shining") women of Chicago throughout the 20th century. I cannot decide if this is just a creepy, mildly feminist horror novel with good characterization and writing, or if Beukes is trying to say something very dark about how men suppress female ability (and, if so, whether she's wrong about us).

Posted at May 31, 2015 23:59 | permanent link