Attention conservation notice: Only of interest if you care a lot about computational statistics.
For our first seminar of the year, we are very pleased to have a talk which will combine two themes close to the heart of the statistics department:
As always, the talk is free and open to the public.
— A
slightly cynical historical-materialist
take on the rise of Bayesian statistics is that it reflects a phase in the
development of the means of computation, namely the PC era. The theoretical or
ideological case for Bayesianism was pretty set by the early 1960s, say with
Birnbaum's argument for the
likelihood principle[1]. It
nonetheless took a generation or more for Bayesian statistics to actually
become common. This is because, under the material conditions of the early 1960s, such ideas could only be defended, not applied.
What changed this was not better theory, or better models, or a sudden
awakening to the importance
of shrinkage
and partial pooling. Rather, it became possible to actually calculate
posterior distributions. Specifically, Monte Carlo methods developed in
statistical mechanics permitted stochastic approximations to non-trivial
posteriors. These Monte Carlo techniques quickly became (pardon the
expression) hegemonic within Bayesian statistics, to the point where I have met
younger statisticians who thought Monte Carlo was a Bayesian
invention[2]. One of the ironies of
applied Bayesianism, in fact, is that nobody actually knows the posterior
distribution which supposedly represents their beliefs, but rather
(nearly[3]) everyone works out that
distribution by purely frequentist inference from Monte Carlo samples.
("How do I know what I think until I see what the dice say?", as it were.)
So: if you could do Monte Carlo, you could work out (approximately) a posterior distribution, and actually do Bayesian statistics, instead of talking about it. To do Monte Carlo, you needed enough computing power to be able to calculate priors and likelihoods, and to do random sampling, in a reasonable amount of time. You needed a certain minimum amount of memory, and you needed clock speed. Moreover, to try out new models, to tweak specifications, etc., you needed to have this computing power under your control, rather than being something expensive and difficult to access. You needed, in other words, a personal computer, or something very like it.
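To make concrete what "doing Monte Carlo" involves, here is a minimal sketch in Python: random-walk Metropolis sampling for a toy model (a normal mean with known variance and a standard normal prior) on made-up data. The model, the data, and the tuning constants are all my illustrative inventions, not anything from the talk; the point to notice is that every single step has to touch the whole data set.

```python
# Minimal sketch (illustrative only): random-walk Metropolis sampling from a
# posterior known only up to a normalizing constant. Toy model and fake data.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=1000)   # stand-in for "the data set"

def log_posterior(theta):
    # standard normal prior on the mean, Gaussian likelihood with known variance 4;
    # every evaluation sums over every data point
    log_prior = -0.5 * theta**2
    log_lik = -0.5 * np.sum((data - theta)**2) / 4.0
    return log_prior + log_lik

def metropolis(n_steps=10_000, step_size=0.1):
    theta = 0.0
    current = log_posterior(theta)
    samples = np.empty(n_steps)
    for t in range(n_steps):
        proposal = theta + step_size * rng.normal()   # symmetric random-walk proposal
        proposed = log_posterior(proposal)
        if np.log(rng.uniform()) < proposed - current:  # Metropolis accept/reject
            theta, current = proposal, proposed
        samples[t] = theta
    return samples

draws = metropolis()
# a frequentist summary of the Monte Carlo draws, discarding burn-in
print(draws[2000:].mean(), draws[2000:].std())
```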
The problem now is that while our computers keep getting faster, and their internal memory keeps expanding, our capacity to generate, store, and access data is increasing even more rapidly. This is a problem if your method requires you to touch every data point, and especially a problem if you not only have to touch every data point but do all possible pairwise comparisons, because, say, your model says all observations are dependent. This raises the possibility that Bayesian inference will become computationally infeasible again in the near future, not because our computers have regressed but because the size and complexity of interesting data sets will have rendered Monte Carlo infeasible. Bayesian data analysis would then have been a transient historical episode, belonging to the period when a desktop machine could hold a typical data set in memory and thrash through it a million times in a weekend.
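To put that scaling worry in symbols (my back-of-the-envelope gloss, not anything from the post's sources): with T Monte Carlo steps over n observations,

```latex
% Rough costs, under my own simplifying assumptions.
% Conditionally independent observations: each step evaluates
%   \log \pi(\theta) + \sum_{i=1}^{n} \log p(x_i \mid \theta),
% i.e. O(n) work per step. A model in which all observations are dependent
% (say a Gaussian-process likelihood with a dense n-by-n covariance that
% depends on \theta) needs O(n^2) memory and, via a Cholesky factorization,
% O(n^3) time at every step.
\[
  \underbrace{O(nT)}_{\text{independent observations}}
  \quad\text{vs.}\quad
  \underbrace{O(n^{3}\,T)}_{\text{dense dependence}}
\]
```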
Of course, I don't know that Bayesian inference is doomed to become obsolete because it will grow computationally intractable. One possibility is that "Bayesian inference" will be redefined in ways which depart further and further from the noble-Savage ideal, but are computationally tractable — variational Bayes, approximate Bayesian computation, and the generalized updating of Bissiri et al. are three (very different) moves in that direction. Another possibility is that algorithm designers are going to be clever enough to make distributed Monte Carlo approximations for posteriors as feasible as, say, a distributed bootstrap. This is, implicitly, the line Scott is pursuing. I wish him and those like him every success; whatever the issues with Bayesianism and some of its devotees, the statistical world would lose something valuable if Bayes as we know it were to diminish into a relic.
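For concreteness, one simple version of the split-and-recombine idea (my own illustration, in the spirit of "consensus" averaging, not a description of Dr. Scott's algorithm) has each of S machines sample from the subposterior built from its shard of the data, with the prior raised to the power 1/S so that the product of the subposteriors is the full posterior, and then recombines the draws by precision-weighted averaging. That combination rule is exact when every subposterior is Gaussian, and only an approximation otherwise.

```python
# Sketch of the recombination step only (illustrative, not Dr. Scott's code).
# Assumes each of S workers has already produced draws from its shard's
# subposterior for a scalar parameter theta.
import numpy as np

def consensus_combine(shard_draws):
    """shard_draws: list of S arrays, each of shape (n_draws,).
    Returns n_draws combined draws approximating the full-data posterior.
    Exact if every subposterior is Gaussian; an approximation otherwise."""
    draws = np.stack(shard_draws)                  # shape (S, n_draws)
    precisions = 1.0 / draws.var(axis=1, ddof=1)   # one weight per shard
    weighted = (precisions[:, None] * draws).sum(axis=0)
    return weighted / precisions.sum()
```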
Update, 16 September 2013: It apparently needs saying that the ill-supported speculations here about the past and future of Bayesian computing are mine, not Dr. Scott's.
Update, 18 December 2013: "Asymptotically Exact, Embarrassingly Parallel MCMC" by Neiswanger et al. (arxiv:1311.4780) describes and analyses a very similar scheme to that proposed by Dr. Scott.
Manual trackback: Homoclinic Orbit
[1]: What Birnbaum's result actually proves is another story for another time; in the meanwhile, see Mayo, Evans and Gandenberger.
[2]: One such statistician persisted in this belief after reading Geyer and Thompson, and even after reading Metropolis et al., though there were other issues at play in his case.
[3]: The most interesting exception to this I know of is Rasmussen and Ghahramani's "Bayesian Monte Carlo" (NIPS, 2002). But despite its elegance and the reputations of its authors, it's fair to say this work has not had much impact.
Posted at September 14, 2013 23:22 | permanent link