Books to Read While the Algae Grow in Your Fur, July 2012
Attention conservation notice: I have no taste.
- Laurence Gough, Karaoke Rap and Funny Money
- Mind candy. Misadventures of scheming, amoral lowlifes in Vancouver;
also the criminals they're supposed to be catching. (I kid, but Willows and
Parker's colleagues are not very prepossessing.) As always, Gough does a very
nice line in studiously disabused narrative.
- 10th and 12th (!) books in a series (some previous installments:
nos. 1, 2, 3, 7
and 8), but both self-contained.
- Josh Bazell, Wild Thing
- Mind candy. Sequel to Beat the
Reaper, though it can be read separately. This is funny and
exciting by turns (and I really like the footnotes, even when I want to argue
with them), but not as well-constructed as its predecessor.
- Diana Rowland, Even White Trash Zombies Get the Blues
- Mind candy. The continuing
travails of the titular white trash zombie, as she tries to keep herself
supplied with brain slurpees, and on the right side of her parole officer.
- Karin Slaughter, Criminal
- Mind candy. Equal parts gripping (if squicky) thriller, and portrait of
struggling against entrenched sexism
in 1975
Atlanta. Part of a long-running series
(previously), but I think one could
jump in here, without loss.
- Spoilery remark: V unq orra cerfhzvat nyy guvf gvzr gung Nznaqn jbhyq
ghea bhg gb or Jvyy'f zbgure. V gnxr fbzr fngvfsnpgvba, ubjrire, va
Fynhtugre'f cebivqvat na rkcynangvba sbe gubfr pyhrf...
- Olivier Catoni, Statistical Learning Theory and Stochastic Optimization [Free PostScript]
- Lots of finite-sample results about randomized and aggregated predictors,
with what Catoni nicely describes as a "pseudo-Bayesian" flavor. Specifically,
he puts a lot of emphasis on "Gibbs estimators", which go as follows. Start
with a space \( \Theta \) of models, where each model \( \theta \) gives us a
distribution over samples, say \( q(x;\theta) \). Stick a prior measure \(
\pi(d\theta) \) over the model space, and fix an "inverse temperature" \( \beta
> 0 \). Nature generates data according to some distribution \( \mathbb{P} \)
which, in general, has nothing to do with any of our models. After seeing data
\( x_1, x_2, \ldots x_n \equiv x_1^n \), we predict \( x_{n+1} \) by averaging
over models, according to the Gibbs measure / exponential family /
pseudo-posterior
\[
\rho(d\theta) = \frac{\left(q(x_1^n;\theta)\right)^{\beta}}{\int{\left(q(x_1^n;\theta^{\prime})\right)^{\beta} \pi(d\theta^{\prime})}}\pi(d\theta)
\]
The point of doing this is that if \( \beta \) is chosen reasonably, then the
expected log-likelihood, predicting according to \( \rho \), is always within
\( O(1/n) \) of the expected log-likelihood of the best models in \( \Theta \).
(Catoni actually calculates the constant buried in the \( O(1/n) \), but the
answer is more complicated than I feel like writing out.) Here, importantly,
expectations are all taken with respect to the true distribution \(
\mathbb{P} \), not the prior \( \pi \). This would not be true if one
did straight Bayesian model averaging with \( \beta=1 \). (A toy numerical
sketch of the Gibbs recipe appears at the end of this entry.)
- As the name "Gibbs estimator" suggests, Catoni milks the thermodynamic
analogy for all it's worth, and much of chapters 4--6 is about approximating
free energies and even susceptibilities. (I suspect that some of
these results are superseded by Maurer's brilliant "Thermodynamics and
Concentration" paper
[arxiv:1205.1595], but am
under-motivated to check.) These analytical results are about proving
generalization error bounds; when it comes to actually doing stuff, Catoni
still recommends sampling from the (pseudo-) posterior with Monte Carlo, hence
the last chapter on transitions in Markov chains. There is also a natural
connection with the results about compressing individual sequences which open
the book.
- The notation is very detailed and sprawling, and I often found it hard to
follow. (The writing sometimes seems to lose the forest for the leaves.) But
many of the results are quite powerful, and I will be keeping my copy for
reference.
- Why oh why can't we have a better academic publishing system?
dep't.: Prof. Catoni has, generously, put a next-to-final PostScript draft
of the book on his website. (Free
is, to repeat, the economically efficient price.)
Comparing this to the printed edition shows that Springer did absolutely
nothing to the manuscript of any value to any reader. (They didn't even run it
through an English spell-checker: e.g., for "Larve" in the title of section
7.3, read "Large".) They did, however, print and bind it, and put it in the
distribution channels to libraries. For this, they charge \$69.95 per copy
— none of which goes as royalties to the author. (I got my copy second
hand.) This is not as ridiculous as what they charge for access to individual
articles, but still exactly what I mean when I say that commercial academic
publishing has become a parasitic obstacle to the growth and dissemination of
knowledge.
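- For concreteness, here is the toy sketch promised above: a short Python
illustration of the Gibbs / pseudo-posterior recipe. The Bernoulli model
family, the uniform grid prior, and the particular \( \beta \) are my own
choices for the example, not anything taken from Catoni; treat it as a sketch
of the general construction rather than of the book's estimators.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model space: Bernoulli(theta) for theta on a grid, with a uniform prior pi.
thetas = np.linspace(0.01, 0.99, 99)
log_prior = np.full_like(thetas, -np.log(len(thetas)))

# "Nature": data from some distribution P, which need not lie in the model family.
x = rng.binomial(1, 0.3, size=200)

beta = 0.5  # inverse temperature; beta = 1 would be straight Bayesian averaging

def gibbs_log_weights(x, thetas, log_prior, beta):
    """Log-weights of rho: proportional to (q(x_1^n; theta))^beta * pi(theta), normalized."""
    n1 = x.sum()
    n0 = len(x) - n1
    loglik = n1 * np.log(thetas) + n0 * np.log(1 - thetas)  # log q(x_1^n; theta)
    logw = beta * loglik + log_prior
    return logw - np.logaddexp.reduce(logw)

logw = gibbs_log_weights(x, thetas, log_prior, beta)

# Predict x_{n+1} by averaging the models' predictive probabilities under rho.
p_next = float(np.exp(logw) @ thetas)
print(f"pseudo-posterior predictive P(x_(n+1) = 1) = {p_next:.3f}")
```
With a grid this small the integral over \( \Theta \) is just a sum; in any
realistic model space one would, as Catoni says, sample from \( \rho \) by
Monte Carlo instead.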
- V. N. Vapnik, The Nature of Statistical Learning Theory
- I recently had occasion to
revisit this book, and to re-read my
review from 1999. The main thing I would change in the review is to bring
out more strongly Vapnik's insistence on bounds on generalization error which
hold uniformly across all data-generating distributions. It
is this which makes finite VC dimension a necessary condition for his
notion of learning. One could learn a model from a family of infinite VC
dimension if the family was adapted to the distribution --- say, if
the VC entropy was well-behaved. (A toy illustration of this point appears
at the end of this entry.)
- Still strongly recommended, with my old caveats.
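- To illustrate the point about infinite VC dimension and distribution-adapted
families, here is a toy sketch of my own (not from the book, and deliberately
crude): a "memorizer" class, the indicator functions of finite sets, has
infinite VC dimension, so no distribution-free guarantee is possible; but when
the data distribution happens to live on a small finite support, the same class
generalizes just fine, which is the situation where distribution-dependent
quantities like the VC entropy stay under control.
```python
import numpy as np

rng = np.random.default_rng(1)

def memorizer_fit(x_train, y_train):
    # Hypothesis class: indicators of finite sets (infinite VC dimension).
    # Empirical risk minimization just memorizes the training points labeled 1.
    return set(x_train[y_train == 1])

def memorizer_predict(memory, x):
    return np.array([1 if xi in memory else 0 for xi in x])

def accuracy(memory, x, y):
    return float(np.mean(memorizer_predict(memory, x) == y))

n = 500

# Case 1: continuous X ~ Uniform(0,1), Y = 1{X > 1/2}.  Zero training error,
# but the memorizer predicts 0 on (almost) every fresh point, so test accuracy
# collapses to the base rate: no generalization, as the VC argument warns.
x_tr = rng.uniform(size=n); y_tr = (x_tr > 0.5).astype(int)
x_te = rng.uniform(size=n); y_te = (x_te > 0.5).astype(int)
m = memorizer_fit(x_tr, y_tr)
print("continuous X: train acc", accuracy(m, x_tr, y_tr),
      " test acc", round(accuracy(m, x_te, y_te), 3))

# Case 2: the same class, but X supported on only 20 atoms.  Once the sample
# has seen (nearly) all the atoms, memorization generalizes fine: the class is
# effectively small for this particular distribution.
atoms = rng.uniform(size=20)
labels = rng.integers(0, 2, size=20)
idx_tr = rng.integers(0, 20, size=n)
idx_te = rng.integers(0, 20, size=n)
x_tr, y_tr = atoms[idx_tr], labels[idx_tr]
x_te, y_te = atoms[idx_te], labels[idx_te]
m = memorizer_fit(x_tr, y_tr)
print("atomic X:     train acc", accuracy(m, x_tr, y_tr),
      " test acc", round(accuracy(m, x_te, y_te), 3))
```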
Books to Read While the Algae Grow in Your Fur;
Pleasures of Detection, Portraits of Crime;
Enigmas of Chance;
Scientifiction and Fantastica
Posted at July 31, 2012 23:59 | permanent link