Books to Read While the Algae Grow in Your Fur, August 2011
Attention conservation notice: I have no taste.
- Vladimir Vovk, Alex Gammerman and Glenn Shafer, Algorithmic Learning in a Random World
- This is a badly-written book full of interesting results and ideas. The
basic goal is simple: rather than making point forecasts, make predictions in
the form of confidence sets, in such a way that the stated confidence level
really does correspond to the actual probability of being right. An obvious
approach would be to use Bayesian updating to form posterior-predictive sets,
but those come with no guarantees of correct coverage unless the prior is
right, and indeed the Bayesian posterior probabilities can be arbitrarily bad
(which is one reason why Bayesians need to test their models).
Another tack would be to form a frequentist predictive distribution, but, while
these exist, they're finicky and delicate.
- The trick used in this book is wonderfully simple. Suppose data points are
exchangeable (i.e., come from a "random world"), and we have a goodness-of-fit
test which gives us a sensible (uniformly distributed) p-value. After
observing a sequence of n data-points, consider all possible values for
data-point n+1, and calculate their p-values. The ones which
cannot be rejected at level α form the prediction set, with confidence
level 1-α. All that is really needed for this to work is that we have
some way of measuring the discrepancy or "conformity" of one data point with
the others which gives uniformly-distributed ranks under the null hypothesis*.
(This is why the authors call their scheme "conformal prediction"; it has
nothing to do with conformal mappings in geometry, much less conformal
field theory.) Actually calculating the prediction set in a reasonable way
depends on the details of the conformity measure; they show that
nearest-neighbor prediction, ridge regression, and some sorts of support vector
machines are fairly easily handled.
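The recipe above can be sketched in a few lines of Python. This is only an illustration, not the book's own algorithm: the function names are made up here, the data are assumed to be real numbers, and the nonconformity score (distance from the mean of the other points) is an arbitrary choice standing in for whatever measure one actually prefers. The p-value of a candidate is the fraction of points, in the sample augmented with the candidate, that look at least as strange as the candidate does.

```python
def distance_from_mean(others, z):
    """Illustrative nonconformity score: how far z lies from the mean of the rest."""
    return abs(z - sum(others) / len(others))


def conformal_p_value(data, candidate, score=distance_from_mean):
    """p-value for the hypothesis that `candidate` is data-point n+1."""
    augmented = list(data) + [candidate]
    # Score every point against all the others, candidate included, so that
    # under exchangeability the candidate's rank is uniformly distributed.
    scores = [
        score(augmented[:i] + augmented[i + 1:], z)
        for i, z in enumerate(augmented)
    ]
    candidate_score = scores[-1]
    # Fraction of points at least as nonconforming as the candidate.
    return sum(s >= candidate_score for s in scores) / len(augmented)


def prediction_set(data, candidates, alpha, score=distance_from_mean):
    """Candidates which cannot be rejected at level alpha."""
    return [c for c in candidates if conformal_p_value(data, c, score) > alpha]
```

Scanning a grid of candidates, as `prediction_set` does, is the brute-force version; the point of the special cases mentioned above is that for some conformity measures the set can be found without trying every candidate.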
- The basic idea can be elaborated into predicting distributions ("Venn
predictors"), into conditional confidence levels, into rescuing Bayesian
prediction intervals, and in some situations into handling dependent data. For
the last, they consider a set-up they call "on-line compression modeling",
which amounts to postulating what Lauritzen calls a "totally
sufficient" statistic, i.e., one which not only is sufficient in the
ordinary sense, but which can be updated recursively, and screens off past and
future observations. (Actually, I think that all they really need is a
predictive Markovian representation, which can be constructed in great
generality; in continuous time and for non-stationary processes, even.)
- The book is, as I said, badly written. Formally, it only requires
knowledge of stochastic processes to the point of understanding exchangeability
(and de Finetti's theorem), martingales and Markov processes (and there are
appendices to refresh the reader on measure-theoretic probability), and of
statistics as far as regression, goodness-of-fit testing and confidence
intervals. In practice, readers will find acquaintance with standard machine
learning ideas, as found in e.g. Hastie, Tibshirani and Friedman, essential.
Even with this background, the
brilliant clarity of the main ideas is obscured by a large mass of unnecessary
detail, non-standard notation and terminology (e.g., refusing to consider
sequences of observations in favor of multisets, a.k.a. "bags", indicated by
extra symbols; or eschewing the idea of sufficiency in the chapters on "on-line
compression modeling"), and some rather dubious philosophy. (The distinction
between "inductive" and "transductive" learning is neither defensible** nor
even fruitful, and I say this with very deep respect for Vladimir Naumovich.)
The obvious connections to frequentist prediction intervals, and to Butler's
predictive likelihood, go unexplored. This is all unfortunate, but until someone
writes a cleaner and clearer account of the theory, I have little choice but to
recommend this to anyone with a serious interest in machine learning or
statistical prediction.
- *: I am indebted to Larry Wasserman for pointing out the importance of
uniform ranking, and for discussing his work on extending these results,
which he really ought to publish.
- **: Supposedly, "transduction" is reasoning
directly from the properties of individual observed cases to those of
individual unobserved cases, without first inducing a general rule, and then
deducing specific instances from it. Clearly, any inductive procedure can be
turned into a transductive one simply by composition of functions. Conversely,
any transductive procedure can be turned into an inductive one, by considering
hypothetical new unobserved cases so as to map out the general rule. This is
thus a distinction without a difference in terms of capacities. At
most there might be a difference in terms of algorithmic representations
(and computational complexity), but that's not relevant to the probabilistic or
statistical theory undertaken here.
- Update, 1 September: Shiva Kaul writes me to remonstrate with me
about transduction. I quote his letter (with permission):
I think transduction (in the modern sense of the word, perhaps not
what Vovk et al discuss) is statistically distinct from induction. I'm
not aware of any transductive sample complexity upper bounds that beat
corresponding lower bounds for inductive sample complexity. However,
transductive upper bounds often beat inductive ones, e.g., "Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing".
The reduction you posted doesn't work for matrix completion. By considering
a hypothetical new missing entry, one eliminates a present entry, which could
change the predicted values for the other missing entries.
My superficial impression from the paper Shiva points me to is that it deals with
a finite set of objects (entries in a matrix), and the difference between the
"inductive" and "transductive" set-ups comes from the former sampling entries
with replacement, which is kind of silly in this context, while the latter does
not. But clearly I need to read and think more deeply before being entitled to
an opinion. (This concludes this edition of Shalizi Smackdown Watch.)
- Tony Judt, Postwar: A History of Europe Since 1945
- A massive, but utterly satisfying, total history of the European
subcontinent since the close of the Second World War — which, of
necessity, involves going back before the war for many things. Judt makes no secret of the fact that his sympathies lie with
anti-Communist liberal social democracy. (He strives very hard to be fair;
his portrait of Thatcher, for instance, shows real respect, though no
admiration. But I am clearly not best-placed to say if he succeeded.)
Accordingly, to his mind the great and incredible accomplishment of western
Europe is not just its recovery, but the construction, in the democratic
welfare states, of one of the most free and most just forms of life humanity
has yet known, intertwined with a new and uniquely peaceful form of
international relations through the European Union. (He is very sound on the
role the United States played in encouraging these developments, which we
should be proud of.) That all these institutions were created with mixed
motives, and are more or less flawed and corrupt, goes with their being human
creations, and does not reduce their accomplishments. This story is
contrasted, intelligently, with that of eastern Europe under Communist rule,
ending with its startlingly peaceful dissolution, with due attention paid to
Gorbachev's remarkable, if entirely unintentional, achievements. (The one
place where I find myself seriously questioning Judt's interpretations is his
insistence that the Soviet economy could not be reformed without undermining
Communist rule. Here he draws on local economists like János Kornai, and the
argument even makes some sense, but how does it explain China and Vietnam?)
- Judt does an outstanding and remarkable job of giving even coverage across
space, across time, and across domestic and international politics, the
economy, social life, popular and high culture, intellectual affairs, and
connections and contrasts among all of these. (The only major area of endeavor
he slights is the history of science and technology, for understandable
reasons.) He moves seamlessly and illuminatingly from the economics
of post-war reconstruction to criticism of films of the 1940s, and then to a
[very characteristic] consideration of the content of collective memories of
the war. Remarkably, he accomplishes all of this while not presuming that his
readers know the story already. I recommend it most highly.
- — Some of the passages here are recycled from essays collected
in Reappraisals (or
perhaps vice versa, considering how long he was working on this book).
- Charlie Stross, The Fuller Memorandum
- Mind-candy. Continuing Lovecraftian
spy-fiction, with office comedy. These have never been quite as light-hearted
as they first seem, but this one has some genuinely creepy and disturbing
scenes and images. Enjoyable independently of previous books in the series,
for certain values of "enjoyment".
- (But it seems to me that Bob is unduly shaken in his atheism. [Since this
all comes up in the first few pages, I don't count it as spoilers.] Yes, his
universe has immensely powerful and ancient alien intelligences, some of whom
take an interest in humanity. But that no more makes it a genuinely theistic
universe than that of a Helicobacter living in a human gut. Ancient,
powerful entities operating under weird-seeming rules of physics are not
eternal, omnipotent supernatural beings. This is another expression of
MacLeod's apophatic atheology.)
- Margaret Maron, Storm Track; Slow Dollar; High Country Fall;
Rituals of the Season; Winter's Child; Hard Row; Death's Half Acre
- Why yes, I did basically spend a week in bed trying to distract myself from
dental pain, how could you tell? These books go down like small, pleasant bits
of candy, but like a lot of mystery stories they are also social fiction, the
on-going theme here being the transformation of rural society in the
South.
- Benjamin I. Schwartz, The World of Thought in Ancient China
- Fairly standard exposition of Chinese philosophy and some of its background
through, roughly, the beginning of the Qin dynasty and the First Emperor,
i.e., mostly the Hundred Schools of the Warring States period. I did not
actually find it any more enlightening than, say, Fung Yu-lan's old book, let
alone something like Graham's Disputers of the Tao. The main distinguishing
features of Schwartz's book
seem to be as follows. (1) Presuming the reader is already familiar with the
broad outlines of the history, both political and intellectual. (2) Spending a
lot of time disputing modern writers without bothering to fully expound their
views (e.g., the argument with Fingarette in the chapter on Confucius, or with
Needham in the chapter on cosmology*, both of which would have been
impenetrable had I not read the other authors first), or even contrasting with
more-or-less fashionable thinkers of the early 1980s (Geertz?!). The
occasional stabs at, say, contrasting Confucius's ideas about ethics in public
and private life with those of Plato and Aristotle are not sustained enough to
really count as comparative history. Finally, (3) many very vague causal
speculations, e.g., that the prevalence of ancestor worship made Chinese
civilization more receptive to "universal monarchy" than other parts of the
world. (I don't suppose that's impossible, but how on Earth could we tell?)
In the end, I got a bit bored, and wouldn't really recommend this for
non-specialists; try Disputers instead, or even Waley's vintage but engaging
Three Ways of Thought in Ancient China. I am not, of course, qualified to
say if it has any value for specialists in Chinese intellectual history.
- Update: I am told, by someone who took Schwartz's classes at
Harvard, that he was an inspiring teacher; I can well believe it. It's
striking, and from my point of view a bit sad, how often great teaching fails
to translate to the printed page, or for that matter vice versa.
- *: To be clear, I think that Schwartz
is right in his criticisms of Fingarette and Needham. The former's
book on Confucius is a mere period piece from a now-abandoned phase of
analytical philosophy; the latter engaged in a lot of speculation, wishful
thinking, and sheer projection when writing about the "five elements" school.
(This does not invalidate the scholarly value of Science and Civilisation
in China.) But these criticisms hardly seem like the most important things
to say about either school.
- Megan Lindholm, Luck of the Wheels
- Mind-candy fantasy novel; the fourth book in a series I haven't read, which
I picked up because Lindholm's The Wizard of the Pigeons is a neglected
classic of urban fantasy (from
before that sub-genre got locked into its current formula), and I was curious
about her other books. The first two-thirds or so of Luck of the
Wheels is an amusing picaresque with some truly dreadful adolescents,
followed by a blood-soaked revenge drama, finishing with what under the
circumstances has to count as a happy ending, though from the viewpoint of the
start of the novel it's an utter disaster. I am especially intrigued by the
fact that every step in this transformation follows plausibly from the previous
one. I will keep an eye peeled for the other books in this series.
- — Incidentally, until looking up her website just now, I had no idea
that Lindholm also writes lap-breaker fantasy epics
as "Robin Hobb"; that answers
my question about whatever happened to her...
- Lois McMaster Bujold, Falling Free
- Early and comparatively unpolished Bujold, which I had somehow never read
before. It's not as masterful as her later works — in particular, the
characters are not as richly developed. But even early, lesser Bujold is deeply
entertaining. (The cover art of my old paperback copy is, as usual with this
publisher, needlessly horrid; I am tempted to buy the NESFA Press edition
simply to replace it with something bearable.)
- Trey Shiels, The Dread Hammer
- Mind-candy; fantasy full of the sort of no-good-can-come-of-this behavior
you find in so many fairy tales, and for that matter epics. I will be reading
the sequel. — "Shiels" is the open pen-name of Linda Nagata, who wrote
some excellent hard science fiction novels in the 1990s and early 2000s, and
then, well, went away for a while. This
is not very much like her earlier books in theme or even style, but still
good.
- Patrick R. Laughlin, Group Problem Solving
- A summary of research by experimental social psychologists on problem
solving by groups of American college students, with special reference (not
unreasonably!) to the contributions of one P. R. Laughlin and collaborators.
These experiments are done on WEIRD subjects, and the problems are
deliberately artificial, so there are the usual worries about generalizing to
other contexts. (Is problem solving by, say, engineering designers really very
much like cryptarithmetic?)
But the experiments do nonetheless show some extremely interesting phenomena,
and a general pattern of minimally-organized groups doing as well or better
than the best individuals, under fairly careful controls. This book should
really be taken more as an extended (158 pp.) review paper than a comprehensive
treatise, and you have to brace yourself for a psychologist's idea of prose
(and indeed a psychologist's idea of what constitutes a "theoretical model";
the online first chapter is representative in both respects), but it's a fast
read, and full of useful information for anyone concerned with collective
cognition.
(The price for the hard-back edition is, however, outrageous.)
- Duncan J. Watts, Everything Is Obvious, Once You Know the Answer
- I'll actually try to give this a full write-up later, but in the meanwhile
I will say: (1) this is great and recommended unreservedly; if you like this
weblog at all you should definitely read it; (2) Tom Slee's review is very
good; and (3) Duncan's been a friend since Santa Fe days, so feel free to
discount my praise, but if I thought this was bad I'd just stay decently quiet
about it.
- Naomi Novik, Empire of Ivory
- Mind-candy; enjoyable continuation of the series about dragons in the
Napoleonic wars, in which Our Heroes venture to Africa, and the forces of
European imperialism and the slave trade are righteously repelled. Of course,
given the situation Novik has set up in her version of Africa, there is no way
in Hell the trans-Atlantic slave trade could have begun in the first place; and
no slave trade means astoundingly different European colonies in the Americas,
if any at all, hence no French Revolution and no Napoleon. In short, the usual
problem with alternate histories. (On examination, as so often, Timothy Burke
said it first, and better.) But I will still get the sequel, because
I want to know how she'll get her heroes out of the soup she lands them in at
the end. — It's been long enough since I read the earlier installments
that I found the catch-up parts welcome, and you could probably read this
without the previous books, but I'd recommend starting the series at
the beginning.
- Karin Slaughter, Fallen
- Absorbing, gruesome and wrenching as usual. I am not quite sure
that it matches the past laid out in earlier books in the series, but this
merely makes me want to go back and re-read them. (The coincidence of this
book's title with one of Kathleen George's is, I think, due to the English
language's sheer poverty of short,
vaguely ominous phrases. But by this point, Slaughter could call a
book Kittens and Flowers and it would fill me with apprehension.)
— Previously.
Posted at August 31, 2011 23:59