## July 31, 2020

### Books to Read While the Algae Grow in Your Fur, July 2020

Attention conservation notice: I have no taste, and no qualifications to say anything about climatology, central Asian history, W. E. B. Du Bois, criminology, or post-modernity.

Samuel S. P. Shen and Richard C. J. Somerville, Climate Mathematics: Theory and Applications
I wanted to like this book much more than I did. It goes over some important pieces of math, not just for climatology but for lots of STEM fields, and the aim is "here's the main idea and how you use it, and leave the rigor to those who want it"; and it handles numerics in R. I was hoping to assign, if not all of it, then at least large chunks to my class on spatio-temporal statistics. As it is, I will just mine it for examples, but I won't even feel totally confident in that, unless I re-do them all.
To illustrate why, the final chapter is on "R Analysis of Incomplete Climate Data". This is a good thing to include in an intro book, because (as they quite rightly say) real data sets almost always have missing values. They use a temperature data set from NCAR where missing values have been coded as -999, which is usually a bad practice (the standards committees on floating-point numerical computation gave us NA for a reason), but, since they're Celsius temperatures, a value below absolute zero should be a warning to an alert user. After doing several examples where the -999.00s are taken literally, Shen and Somerville correctly say that coding missing values as -999.00 "can significantly impact the computing results" --- so "We assign missing data to be zero" (p. 286)! (Their code does not re-assign -999.00s to be zero, but without such a missing re-assignment their code would not produce the figure which follows this piece of text.) Even more astonishing, in section 11.4 (pp. 295ff), they handle this in the correct way, by replacing the -999s (strictly, values $< -490$) with NAs. In between, in section 11.3 (pp. 293--295), they fit 9th and 20th (!) order polynomials to an annual temperature series from 1880--2016. "The choice of the 20th-order polynomial fit is because it is the lowest-order orthogonal polynomial that can mimic the detailed climate variations... We have tried higher-order polynomials which often show an unphysical overfit." (p. 294) --- I bet they do! The term "cross-validation" does not appear in the index, or, I believe, in the book. These are especially gross mis-steps, but I fear that stuff like this is lurking in the data-analytic examples.
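To make the missing-value point concrete, here is a minimal sketch in Python (not the book's R, and not its NCAR data). The six temperatures are made up for illustration and the function names are mine. It compares the mean when -999 is taken literally, when missing values are set to zero, and when they are marked as missing (using the cutoff of values below -490 mentioned above) and left out of the average.

```python
import math

SENTINEL_CUTOFF = -490.0  # values below this are treated as missing codes

def mark_missing(values, cutoff=SENTINEL_CUTOFF):
    """Replace sentinel-coded entries with NaN, leaving real readings alone."""
    return [math.nan if v < cutoff else v for v in values]

def mean_ignoring_missing(values):
    """Average only the entries that are not NaN."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan

# Hypothetical Celsius readings, two of them coded as missing.
temps_c = [14.2, 15.1, -999.0, 13.8, -999.0, 14.9]

# 1. Take the sentinel literally.
literal_mean = sum(temps_c) / len(temps_c)

# 2. "Assign missing data to be zero".
zero_filled = [0.0 if v < SENTINEL_CUTOFF else v for v in temps_c]
zero_mean = sum(zero_filled) / len(zero_filled)

# 3. Mark as missing and drop from the average.
clean_mean = mean_ignoring_missing(mark_missing(temps_c))

print(literal_mean, zero_mean, clean_mean)
```

With these made-up numbers the literal mean is far below absolute zero, the zero-filled mean is pulled several degrees towards zero, and only the third version is the average of the temperatures actually observed.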
Other errors / causes of unhappiness (selected):
• Pp. 127--128, the chemical symbol for helium is repeatedly given as "He2", though helium is, of course, a monoatomic gas (with an atomic weight of 3 or 4, depending on the isotope).
• P. 141, "Clearly the best linear approximation to the curve $y=f(x)$ at a point $x=a$ is the tangent line at $(a, f(a))$", with slope $f^{\prime}(a)$. This is not clear at all! If you want approximation at that point only, any line which goes through that point will work equally well, regardless of its slope. If you want approximation over some range, then the slope of the optimal linear approximation (in the mean-squared sense) is given by $\mathrm{Cov}(X, f(X))/\mathrm{Var}(X)$, which will equal $f^{\prime}(a)$ if $f(x)$ is a linear function. Now over a sufficiently narrow range, a well-behaved function will be well-approximated by the tangent line, i.e., a first-order Taylor approximation will work well. What counts as a "sufficiently narrow range" will depend on (i) how good an approximation you demand and (ii) the size of the remainder in Taylor's theorem. Since that remainder is $\propto (x-a)^2 f^{\prime\prime}(a)$, we need $|x-a|$ to be negligible compared to $1/\sqrt{|f^{\prime\prime}(a)|}$, which is a measure of the local curvature of the function. Requiring $|x| \ll 1$, as the authors do repeatedly, is neither here nor there.
• The book opens with a chapter on dimensional analysis. There is a good point to make here, which is that the units on both sides of an equation need to balance, and so the arguments to transcendental functions (like $e^x$ or $\log{x}$ or $\sin{x}$ or $\Gamma(x)$) should be dimensionless (generally, ratios of quantities with physical dimensions). This is a good way to avoid gross mistakes. But of course you can always make the units balance by sticking in the appropriate scaling factor on one side or another of the equation *. (When you do linear regression, $Y = \beta X + \mathrm{noise}$, the units of $\beta$ are always $\frac{[Y]}{[X]}$, and, e.g., an ordinary least squares estimate will respect this by construction.) Our authors want however to persuade the reader that dimensional analysis is a way to "discover useful formulas or laws of physics". Exhibit A for this is a purported derivation of the equation for the period $\tau$ of oscillation of a pendulum of length $l$ ($\tau \propto \sqrt{l/g}$) from sheer manipulation of units. Which is absurd, of course. Why should the period be a product of powers of the pendulum bob's mass, its length and the acceleration due to gravity alone? Even if we insisted on it being a product of powers of parameters, what about the amplitude of the oscillation (units: maximum horizontal displacement from the vertical axis), or the friction of the air, and/or the friction at the pivot point, and/or the speed of sound in the air, and/or the speed of sound in the pendulum rod? Dimension-juggling, in this case, happens to give the correct answer, once the right variables are being juggled. It's the answer one can derive from the actual physics, in the limit of a perfectly rigid rod swinging frictionless in a vacuum, and small amplitude oscillations (i.e., ones where $\sin{\theta} \approx \theta$ even for the largest angle of displacement $\theta$ from the vertical).
For larger amplitudes (still idealizing away friction), the period is still proportional to $\sqrt{l/g}$, but does involve a transcendental function of the maximum angle. In terms of basic quantities with physical dimensions, that angle is itself a transcendental function of both the length of the pendulum and the maximum horizontal displacement of the bob from the vertical axis. In short, this is an example where dimensional analysis only seems to work because we know the right answer to begin with, and reverse-engineer the problem set-up accordingly. (I claim the same is true of their other examples of dimensional analysis, but I lack the patience to go through them all.)
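Two of the quantitative points in the list above are easy to check numerically. The sketch below is my own illustration, not anything from the book. For the linear-approximation point, it takes $f(x) = e^x$ around $a = 0$ (my choice of function), puts $X$ on an evenly spaced grid as a stand-in for a uniform distribution, and compares $\mathrm{Cov}(X, f(X))/\mathrm{Var}(X)$ with the tangent slope $f^{\prime}(0) = 1$ as the interval widens. For the pendulum, it compares the small-angle period $2\pi\sqrt{l/g}$ with the frictionless large-amplitude period, using the standard arithmetic-geometric-mean form $\tau = 2\pi\sqrt{l/g}\,/\,\mathrm{AGM}(1, \cos(\theta_{\max}/2))$, with $\theta_{\max}$ measured from the vertical. The value $g = 9.81$ and the 1-metre length are arbitrary assumptions.

```python
import math

def least_squares_slope(f, a, half_width, n=2001):
    """Cov(X, f(X)) / Var(X) for X on an even grid over [a - w, a + w]."""
    xs = [a - half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var = sum((x - mean_x) ** 2 for x in xs) / n
    return cov / var

def agm(a, b, iterations=30):
    """Arithmetic-geometric mean; converges in far fewer than 30 steps."""
    for _ in range(iterations):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def small_angle_period(length, g=9.81):
    """Period in the limit of small oscillations."""
    return 2.0 * math.pi * math.sqrt(length / g)

def exact_period(length, theta_max, g=9.81):
    """Frictionless period for maximum angle theta_max (radians from vertical)."""
    return small_angle_period(length, g) / agm(1.0, math.cos(theta_max / 2.0))

# Least-squares slope versus the tangent slope of 1, for wider and wider ranges.
for w in (0.01, 0.1, 1.0):
    print("half-width", w, "slope", least_squares_slope(math.exp, 0.0, w))

# Ratio of exact to small-angle period for a 1 m pendulum.
for degrees in (5, 30, 90):
    theta = math.radians(degrees)
    print(degrees, "deg", exact_period(1.0, theta) / small_angle_period(1.0))
```

Under these assumptions the least-squares slope sits very close to the tangent slope on the narrow interval and drifts away from it as the interval widens, while the period ratio stays near 1 for small swings and is noticeably above 1 at 90 degrees, without $\sqrt{l/g}$ ever changing.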
In any case, this claim to use dimensional analysis to work out physical laws from scratch is (wisely) dropped in the rest of the book. Thus chapter 5 is a decent introduction to "energy-balance" models of climate, based on the principle that a planet will heat (or cool) until the rate of energy coming in from the Sun matches the rate of energy being radiated away, since that rate increases with temperature. Specifically, the Stefan-Boltzmann law says that the rate at which a body at (absolute) temperature $T$ emits radiation is proportional to its surface area $A$ and to the fourth power of $T$, $P \propto AT^4$. I defy anyone to guess $T^4$ based on dimensional considerations alone **, but that's fine, all that dimensional analysis really forces is that there needs to be a power of $[T]^{-4}$ in the proportionality constant.
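As a rough illustration of the energy-balance idea (a sketch under my own assumptions, not the book's model or code): take a solar constant of 1361 W/m², a planetary albedo of 0.3, and a black-body surface. These are conventional round figures which do not come from the review. A sphere intercepts sunlight over its cross-section but radiates from its whole surface, hence the factor of 4. The heat capacity in `relax` is a purely hypothetical number, there only to show the temperature heading to the same balance point from above and from below.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def absorbed_flux(solar_constant=1361.0, albedo=0.3):
    """Mean absorbed sunlight per unit of surface area, in W m^-2."""
    return solar_constant * (1.0 - albedo) / 4.0

def equilibrium_temperature(solar_constant=1361.0, albedo=0.3):
    """Temperature at which emitted power SIGMA * T^4 matches absorbed power."""
    return (absorbed_flux(solar_constant, albedo) / SIGMA) ** 0.25

def relax(t_start, years=100.0, heat_capacity=4.0e8, dt=86400.0):
    """Euler steps of C dT/dt = absorbed - SIGMA * T^4.

    heat_capacity (J m^-2 K^-1) is a hypothetical value; dt is one day in seconds.
    """
    temperature = t_start
    steps = int(years * 365 * 86400 / dt)
    for _ in range(steps):
        net = absorbed_flux() - SIGMA * temperature ** 4
        temperature += dt * net / heat_capacity
    return temperature

t_eq = equilibrium_temperature()
print(round(t_eq, 1))
print(round(relax(200.0), 1), round(relax(300.0), 1))
```

With these inputs the balance point comes out at about 255 K, and both the cold start and the warm start end up there. Because of the fourth power, doubling the absorbed flux raises the equilibrium temperature by a factor of only $2^{1/4}$.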
Shen and Somerville clearly know a lot more climatology than I ever will, and have been at this game a long time. (This is why they write R as though it were Fortran.) They are elders I really ought to respect. There's even a lot of good material in their book. But I really, really wish they'd written it with more care.
*: Thus when Boltzmann wanted entropy (SI units: $\mathrm{J} \mathrm{K}^{-1}$) to be proportional to the (unitless) log of the number of accessible states, he invented what we now call Boltzmann's constant. Not every failure to balance units is a discovery worthy of an eponym, but how is a student to tell the difference?
**: Shen and Somerville don't even try, instead (sec. 7.5, pp. 191--195) correctly deriving it from Planck's law for the distribution of black-body radiation. I did remember that there was a non-quantum-mechanical, thermo-and-E&M, derivation (because I completely flubbed a problem set about it as an undergrad taking statistical mechanics), and Wikipedia yields it up; if I can apply my confused-tourist's German to 19th century scientific prose, it seems to be more or less Boltzmann's original approach. (Incidentally, if Prof. S. ever happens across this, I still feel embarrassed at how badly I did in your class! At least it made me more sympathetic to my own students' bouts of senioritis.)
Paul Dupuis and Richard S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations
Enzo Olivieri and Maria Eulália Vares, Large Deviations and Metastability
Having already tried my best to explain what large deviations theory is, I will take that as read, and try to describe these books' contributions to it.
Dupuis and Ellis is about a (then) new way of proving large deviations results. "Weak convergence" or "convergence in distribution" says that a sequence of probability measures converges when the averages they give to functions converge, for all bounded, continuous functions. In large deviations theory, we have a sequence of probability measures converging exponentially fast, but only exponentially fast, to fixed limits. (Very roughly, $-n^{-1}\log{p(x)} \rightarrow h(x)$, for some "rate function" taking its minimum value of $0$ at a magic point $x^*$.) Laplace's principle is a way of approximating integrals of the form $\int{f(x) e^{-n h(x)} dx}$ by trading off the point where $f(x)$ is maximized against the point where $h(x)$ is minimized. Part of what Dupuis and Ellis do is show that large deviation principles, stated in terms of probability measures, can be equivalently expressed in terms of (simplifying) Laplace approximation working for all suitably well-behaved functions. The pay-off from doing this is that the integrals can then often be expressed in terms of solving an optimal control problem: how much do we have to shift the probability distributions to move the integral to a desired value? What's the cost of the cheapest intervention? This, in turn, they apply to a lot of problems about the convergence of stochastic processes, especially Markov processes where the kick the process gets depends on its current state, but the distribution of kicks changes continuously with the state, which they call "random walks with continuous statistics". (They also consider some amount of discontinuity.) While this is in principle self-contained, I'd really recommend prior acquaintance with large deviations, at least at the level of den Hollander's little book, or Dembo and Zeitouni.
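The "very roughly" statement above can be watched happening in the simplest case I know, which is my choice of example and not the book's: the fraction of heads in $n$ fair coin flips. There the rate function for the fraction being at least $a > 1/2$ is the relative entropy $a\log(2a) + (1-a)\log(2(1-a))$, and the sketch below compares it with $-n^{-1}\log$ of the exact tail probability. The function names are mine, and the threshold $a = 0.7$ is arbitrary.

```python
import math

def rate_function(a, p=0.5):
    """Relative entropy of Bernoulli(a) from Bernoulli(p), for 0 < a < 1."""
    return a * math.log(a / p) + (1.0 - a) * math.log((1.0 - a) / (1.0 - p))

def empirical_rate(n, a):
    """-(1/n) log P(S_n >= a n), with S_n the number of heads in n fair flips."""
    # Small offset guards against floating-point error in a * n.
    k_min = math.ceil(a * n - 1e-9)
    # Exact integer count of outcomes in the tail; math.log accepts big ints.
    tail_count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    log_prob = math.log(tail_count) - n * math.log(2.0)
    return -log_prob / n

limit = rate_function(0.7)
for n in (100, 1000, 2000):
    print(n, empirical_rate(n, 0.7), limit)
```

The gap between the two columns shrinks as $n$ grows, but only slowly, since the probabilities match the exponential only up to sub-exponential factors. That is the "exponentially fast, but only exponentially fast" part.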
Olivieri and Vares is a detailed treatment of what large deviations theory tells us about transitions from one (quasi-) stable state to another, how long we can expect a process to remain in the vicinity of a stable state, etc. This has been a key topic in large deviations theory since Freidlin and Wentzell in the 1970s, and the book contains a good precis of Freidlin-Wentzell theory, before moving on to new results. Many of these are inspired by statistical-mechanical problems, but at such an abstract level that no real knowledge of physics is required, or helpful. The same recommendations about prior knowledge apply here as with Dupuis and Ellis.
W. Barthold [= V. V. Bartol'd], Turkestan Down to the Mongol Invasion
This is an archaic --- the first Russian edition is from 1900! --- but insanely detailed political/military history of Transoxiana and, secondarily, Khurasan, from the first Muslim invasions down to the immediate aftermath of the Mongol conquest, say from +650 to +1230. It's based primarily on the medieval Muslim historians and geographers (which, as an Orientalist, Barthold read in the original), supplemented with translated Chinese and Mongol sources towards the end.
In brief: the Arab Muslims invaded under the Umayyads, and gradually took permanent control of more and more of the territory, but seem to have left the local nobility (speaking Iranian-family languages and heavily influenced by Sassanian culture) intact, even including some traditional kingships, after conversion. The region backed the Abbasids in their successful bid to overthrow the Umayyads, strengthening ties to the Caliphate. Central rule from Baghdad was gradually replaced by hereditary governors, drawn from the local nobility, most prominently the Samanids. Samanid rule passed to Turkish dynasties (shout-out here to my home town boy Mahmud of Ghazna), partly because of the slave-soldier institution but also because of increasing Turkish migration into Transoxiana and even into Khurasan. Thus we get a series of spectacular invasions by Turkish groups from further east and north, such as the Seljuks and the Kara-Khanids, and even long-enduring native polities, like Khwarazem, get Turkish dynasties. Eventually, Khwarazem comes to dominate the region, only to spectacularly piss off Genghis Khan, provoking the westward invasion of the Mongols which sweeps all before it. (Barthold knows that all the accounts of provocation come from pro-Mongol sources, but is inclined nonetheless to believe them.)
Now imagine all of the reconstructable political and military details of these six centuries related at a rate of about a page a year.
One thing which struck me is how much uncertainty attaches even to very basic facts like "when exactly did this happen?" or "what was that person's name?" (Dates given by different sources don't agree; dates given by the same source don't agree; dates are given but they're impossible because this day of that month of such-and-such a year A.H. was not, in fact, a Tuesday; dates of so-and-so's rule are given by the sources but don't match up with the evidence of coinage; etc.)
Whitney Battle-Baptiste and Britt Rusert (eds.), W. E. B. Du Bois's Data Portraits: Visualizing Black America: The Color Line at the Turn of the Twentieth Century
The primary interest here is the reproduction of all the statistical graphics Du Bois created for the Paris World's Fair of 1900. They're accompanied by a set of contemporary scholarly essays, of which the best, I think, is the one by Aldon Morris (which moves his biography of Du Bois up my to-read queue). The essays mostly relate what Du Bois did to the rest of his career and to traditions of African-American scholarship, black studies, etc. This is entirely appropriate, but they're largely silent about something I'm curious about: how his work fit into the history of statistical graphics, and of the uses of visual displays of quantitative information in sociology and political economy. (In particular: was this something he learned to do in Berlin?) What graphics (if any) did the other exhibitions in Paris have?
Brendan O'Flaherty and Rajiv Sethi, Shadows of Doubt: Stereotypes, Crime, and the Pursuit of Justice
Morally serious and technically impeccable. (This blog post by Sethi gives a bit of a taste.) It deserves a very full review, which I will not give it.
Disclaimer: Prof. Sethi and I are both external faculty at the Santa Fe Institute, and have been known to say kind things about each other over the years. It'd be awkward for me to write publicly that this book was very bad, but I have no real incentive to praise it (other than thinking it worthy).
Rachel Bach, Fortune's Pawn, Honor's Knight, Heaven's Queen
Mind candy space opera. Tasty enough that I read all three in quick succession.
Jean-François Lyotard, The Postmodern Condition: A Report on Knowledge
I blame Adam Elkus for making me revisit this. (But I can't now find the post.) My copy was an artifact of my grad school days in the early 1990s, when I adhered very strictly to my mother's advice that bad ideas were "to be shot after a fair trial". (I've been told that that phrase isn't funny anymore.) My remarks spun somewhat out of control, so they'll be a separate review. In short: why did anyone care so much about this?

Posted at July 31, 2020 23:59 | permanent link