Books to Read While the Algae Grow in Your Fur, January 2010
- Virginia Swift, Hello, Stranger
- Mind-candy. Enjoyable mystery with eccentric academics, God-botherers and
gentrification in present-day Laramie. 4th book in a series; I'll keep an eye
out for the others.
[Later: vols. 1--3]
- Intelligence
- Smart crime/spook drama set in one of the most attractive cities in the
world (Vancouver), which could only be improved if it didn't end in the WORST
CLIFFHANGER EVER. (Ahem.) Not, of course, as good as The Wire,
but then nothing is.
- Daniel Waley, The Italian City-Republics
- Short, readable political-institutional history of the communes of northern
and central Italy. He begins with the communes starting to take form in the
towns and wrest control from their bishops, say around 1000, and ends by about
1400, by which point the towns had almost all, except for Venice,
descended into some form of monarchy, generally under the domination of the
local feudal land/war-lords. (Waley says little about Venice, which in
retrospect seems odd, though it didn't strike me while reading it.) While
Waley is good at describing this historical trajectory, he says little about
why so many Italian cities followed it. I'd think it'd be natural to
compare the Italian case to contemporary cities elsewhere, but I think there is
exactly one sentence on them. (I imagine all kinds of interesting comparative
work could be or has been done.) But within those limits, it's a nice book.
Waley has also written studies on Siena and Orvieto, which sound interesting.
- Terry Pratchett, Nation
- You don't really need me to recommend Terry Pratchett to you, especially
when he's writing about how people find ways to go on when their world has been
pointlessly destroyed.
- Richard Hofstadter, Anti-Intellectualism in American Life
- Astonishingly, this still feels like it fits after a lapse of half
a century. The whole "tax-raising,
latte-drinking, sushi-eating, Volvo-driving, New-York-Times-reading,
body-piercing, Hollywood-loving, left-wing freak-show" nonsense of the last
thirty years now makes a lot more sense; and the chapters about the history of
American education were frankly a revelation to me. (The chapter on Dewey and
his pedagogical influence seems like a model of being respectfully but
unrelentingly critical.) No doubt for real historians, this is all painfully
outdated, and whatever's actually sound has long since been incorporated into
other works, which don't provide such unintentional moments of amusement as
his listing, among the unfair accusations heaped on Jefferson, that of keeping a
slave mistress and having children by her. (For that matter I don't care for
the Beats very much, but they certainly contributed more to our literature than
he thought they would.) Still: the man could write.
- ObLinkage: Steve Laniel on AIiAL.
- D. N. MacKenzie (trans.), Poems from the Divan of Khushâl Khân Khattak
- The first significant body of poetry in Pashto; Khushal
was a 17th century warlord in what is now the Northwest Frontier, owing his
position to a combination of tribal authority and appointment by the Mughals.
This seems to be the most recent translation of a selection from his poetry in
English, dating from 1965. It is arranged on no particular principles (some
Pashto editions are, following tradition, arranged alphabetically by the first
letter of the poem), which produces a rather odd effect, one I might summarize
as follows: Khushal is happily in love: wow is the beloved a hottie. Khushal
is unhappily in love: separation is awful, especially if it's because the
beloved doesn't want to see Khushal. Khushal is a fierce warrior who is also a
keen hunter; falconry rules. Khushal has a remarkable capacity for drink. (Go
ahead, try and tell me that's allegorical.) Aurangzeb sucks, especially in
comparison to his father. (Well, he did, and
sticking Khushal in jail can't have won him any points.) The Afghans should
rally to Khushal and defeat Aurangzeb! Men are treacherous, false-faced
bastards, but Afghans are really worse than the rest. (To be fair, having one
of your own sons wage war on you in the name of Aurangzeb has got to be pretty
embittering.) Khushal will withdraw from the sinful world and spend his days
in pious penance. Khushal glorifies God. Repeat.
- My grandfather's extemporized translations were better English poetry, but
I will never hear those again.
- Moez Draief and Laurent Massoulié, Epidemics and Rumors in Complex Networks
- A nice short (< 120 pp.) account of the connections among stochastic
network models, branching processes, and epidemic models, of the
"susceptible-infectious-susceptible" or "susceptible-infectious-recovered"
type, including epidemics on networks. ("Rumors" are assumed to fall under
such models.)
- They begin with the basic Galton-Watson branching process model, where each
member of a population produces a random number of descendants (possibly zero),
independently of everyone else, and this distribution is constant both within
and across generations. Following over a century of tradition, they look at
whether the population survives forever or goes extinct, how large it gets, how
long it takes to go extinct if it does, etc. This then gets turned into a
simple epidemic model ("member of population" = infected individual). It also
maps on to the Erdős-Rényi network model, with "has an edge with" taking the
place of "is a descendant of": pick your favorite node, and connect it to a
random selection of other nodes, the number following a binomial distribution;
connect each of them in turn to more random nodes. The size of the branching
process's population corresponds to the size of the connected component in the
graph. The mapping really only works in the limit of low-density graphs
(the size of the component is roughly a sum of independent quantities
when there are no loops), but it's enough to study the emergence of a giant
component and the behavior of the diameter of the graph. As a prelude to more
sophisticated models, they then prove a form of Kurtz's Theorem on the
convergence of Markov chains to ordinary differential equations in the
large-population limit. The second half of the book rehearses Watts-Strogatz
small-world and Barabási-Albert scale-free networks (including mention of Yule
but not, oddly, of Herbert Simon), before
wrapping up with epidemic models on graphs, and the "viral marketing" problem
of deciding where, on a known and fixed network, to start an epidemic for
maximum impact.
- Of course, since it's a mathematics book, the problem of how to link these
models to data isn't even dismissed.
- This isn't a ground-breaking work, but it's nice to have all this in a
single book, and one a bit more accessible than, say, Durrett's Random
Graph Dynamics (though by the same token less comprehensive). The
implied reader is comfortable with stochastic processes at the level of
something like Grimmett and Stirzaker; measure-theoretic
issues are avoided, even when discussing Kurtz's Theorem. (Their version
is thus much less precise and powerful than his, but vastly easier to
understand.) Anyone comfortable with that level of probability could read it
without much trouble, and I'd happily use it in a class.
- Disclaimer: I read a draft of the manuscript for the publisher
in 2007, and they sent me a free copy of the book, but I have no stake in its
success.
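The survive-or-go-extinct question from the branching-process paragraph above is easy to poke at numerically. What follows is a minimal sketch, not anything taken from the book: it assumes a Poisson offspring distribution (the book's is general), and the function names, the population cap, and the habit of treating "reached the cap" as "survived forever" are all my own illustrative choices.

```python
import math
import random


def poisson(mean, rng):
    """Draw one Poisson(mean) variate by Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def total_progeny(mean_offspring, rng, cap=1000):
    """Total number of individuals ever born in one Galton-Watson run.

    Starts from a single ancestor; every individual independently has a
    Poisson(mean_offspring) number of children (an assumed distribution).
    Stops early, returning cap, once the population has reached cap.
    """
    alive, total = 1, 1
    while alive > 0 and total < cap:
        children = sum(poisson(mean_offspring, rng) for _ in range(alive))
        total += children
        alive = children
    return min(total, cap)


def survival_fraction(mean_offspring, runs=500, seed=0, cap=1000):
    """Fraction of runs that reach cap, a rough stand-in for 'never dies out'."""
    rng = random.Random(seed)
    hits = sum(total_progeny(mean_offspring, rng, cap) >= cap for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    for mean in (0.8, 1.0, 1.5):
        print(mean, survival_fraction(mean))
```

With a mean number of offspring below one, essentially every run dies out quickly; above one, a positive fraction of runs takes off. Under the sparse-graph mapping described above, that is the same threshold at which a giant component appears.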
- Joseph L. Graves, Jr., The Emperor's New Clothes: Biological
Theories of Race at the Millennium
- There are places where he lapses into biological jargon, and others where I
think lay readers would have benefited from more detailed rebuttals of the
common counter-arguments, but over-all I recommend this very strongly. (Thanks
to I.B. for lending me her copy.)
- Pascal Massart, Concentration Inequalities and Model Selection
- Using empirical process theory, and more specifically concentration of
measure, to get finite-sample, i.e., non-asymptotic, risk bounds for
various forms of model selection. The basic strategy is to find conditions
under which every model in a reasonable class will, with high probability,
perform about as well on sample data as it can be expected to do on new data;
this involves
constraining the richness or flexibility of the model class. A little extra
work, and the addition of suitable penalties to the fit, gets bounds that
extend over multiple classes of model, even over a countable infinity of
classes. Among other highlights, Massart shows why the famous AIC
heuristic is often definitely sub-optimal, and how to correct it; he
also offers corrections to Vapnik's (much better) structural risk minimization,
and a nice treatment of data-set splitting (= 1-fold cross-validation). All of
this is for IID data, so the usual caveats apply. Formally self-contained, but
realistically some previous exposure to empirical processes (at the level of
Pollard's notes if not higher) will be needed. Available for free as a large
PDF preprint, but I found it much more convenient to read a dead-tree copy.
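To make the two selection strategies mentioned above concrete (penalized fit and data-set splitting), here is a toy sketch. Everything in it is an assumption of mine rather than anything from Massart: the model classes are piecewise-constant regressions on [0, 1) with different numbers of bins, the penalty is a generic Mallows/AIC-style term proportional to (number of bins)/n with a made-up `weight`, and the noise variance is treated as known. It shows the shape of the procedures, not his constants or his bounds.

```python
import random


def fit_regressogram(xs, ys, bins):
    """Piecewise-constant least-squares fit on [0, 1): one mean per bin."""
    sums = [0.0] * bins
    counts = [0] * bins
    for x, y in zip(xs, ys):
        b = min(int(x * bins), bins - 1)
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to 0.0, a crude choice that is fine for a sketch.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def mean_squared_error(levels, xs, ys):
    """Average squared residual of a fitted regressogram on the given data."""
    bins = len(levels)
    errors = [(y - levels[min(int(x * bins), bins - 1)]) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def select_by_penalty(xs, ys, candidates, noise_var, weight=2.0):
    """Pick the bin count minimizing training error + weight*noise_var*bins/n."""
    n = len(xs)

    def score(bins):
        levels = fit_regressogram(xs, ys, bins)
        return mean_squared_error(levels, xs, ys) + weight * noise_var * bins / n

    return min(candidates, key=score)


def select_by_splitting(xs, ys, candidates):
    """Fit on the first half; pick the bin count with least error on the rest."""
    half = len(xs) // 2

    def score(bins):
        levels = fit_regressogram(xs[:half], ys[:half], bins)
        return mean_squared_error(levels, xs[half:], ys[half:])

    return min(candidates, key=score)


def make_data(n, noise_sd, seed=0):
    """Synthetic IID data: a step function at x = 0.5 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [(1.0 if x < 0.5 else -1.0) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_data(400, 0.5)
    candidates = [1, 2, 4, 8, 16, 32, 64]
    print("penalized choice:", select_by_penalty(xs, ys, candidates, 0.25))
    print("hold-out choice: ", select_by_splitting(xs, ys, candidates))
```

Unpenalized training error always favors the richest class; the penalty and the held-out half are two ways of charging for that flexibility. Setting `weight` absurdly high forces the smallest model, which is a handy sanity check.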
- Elizabeth Bear, New Amsterdam
- Alternate-history fantasy mystery stories. Owing something, perhaps, to
Randall Garrett's "Lord Darcy" stories (the name of the heroine is distinctly
suspicious), but without their complacency about the benevolence of the powers
that be.
- David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining
- I've used this three times now in teaching 36-350, with
about 75 students total over the years. I keep using it because it's the best
textbook on data-mining I know. It
covers the whole process, soup to nuts: data collection (and the importance of
understanding what the data actually mean, if anything), cleaning, databases,
model construction, model evaluation, optimization, visualization, etc. All of
this is organized around four crucial questions: what kind of pattern are we
looking for in the data, and how do we represent those patterns? how do we
score representations against each other? how do we search for good
representations? what do we need to do to implement that search efficiently?
All of the basic methods (and many not so basic ones) are in here, all seen as
different answers to these questions. I find its explanations extremely clear,
and my students seem to as well. I regard it as a strength that it
is not tied to pre-canned software, which would only encourage
dependency and thoughtlessness.
- The only real competition, to my mind, is Hastie, Tibshirani and Friedman.
But the Stanford book is distinctly more
about statistics, and has more statistical theory and math (though
not, from my point of view, a lot of either), whereas this one is
distinctly focused on data-mining and on computation. It would be nice if Hand
&c. had material on support vector machines, and more on ensemble methods;
perhaps it's time for a second edition?
- Disclaimer: I almost took a post-doc under Smyth rather than
coming to CMU, back in 2004; also, the MIT Press sent me a free review copy of
this book (in 2001).
Books to Read While the Algae Grow in Your Fur; Pleasures of Detection, Portraits of Crime; Enigmas of Chance; Scientifiction and Fantastica; Writing for Antiquity; Afghanistan and Central Asia; The Natural Science of the Human Species; Networks; The Beloved Republic; The Commonwealth of Letters; Learned Folly
Posted at January 31, 2010 23:59