### Lecture: The Bootstrap (Advanced Data Analysis from an Elementary Point of View)

The sampling distribution is the source of all knowledge regarding
statistical uncertainty. Unfortunately, the true sampling distribution is
inaccessible, since it is a function of exactly the quantities we are trying to
infer. One exit from this vicious circle is the bootstrap principle:
*approximate* the true sampling distribution by *simulating* from
a good model of the process, and treating the simulated data just like the
real data. The simplest form of this is parametric bootstrapping, i.e., simulating
from the fitted model. Nonparametric bootstrapping means simulating by
re-sampling, i.e., by treating the observed sample as a complete population and
drawing new samples from it. Bootstrapped standard errors, biases, confidence
intervals, p-values. Tricks for making the simulated distribution closer to
the true sampling distribution (pivotal intervals, studentized intervals, the
double bootstrap). Bootstrapping regression models: by parametric
bootstrapping; by resampling residuals; by resampling cases. Many, many
examples. When does the bootstrap fail?
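To make the resampling idea concrete, here is a minimal sketch of the nonparametric bootstrap, written in Python with only the standard library. It is an illustration under stated assumptions, not the code from the notes: the function names (`bootstrap_replicates`, `bootstrap_summary`, `quantile`), the number of replicates, and the simulated exponential sample are all made up for this example. It computes a bootstrapped standard error, a bias estimate, and a pivotal (basic) confidence interval for the median.

```python
import random
import statistics


def bootstrap_replicates(data, estimator, n_boot=2000, seed=0):
    """Treat the sample as the population: re-sample it with replacement
    n_boot times and re-apply the estimator to each re-sample."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]


def quantile(sorted_vals, q):
    """Crude empirical quantile of an already sorted list (an assumption of
    this sketch; any reasonable quantile rule would do)."""
    idx = min(len(sorted_vals) - 1, max(0, int(q * len(sorted_vals))))
    return sorted_vals[idx]


def bootstrap_summary(data, estimator, alpha=0.05, n_boot=2000, seed=0):
    """Bootstrapped standard error, bias, and pivotal confidence interval."""
    theta_hat = estimator(data)
    reps = sorted(bootstrap_replicates(data, estimator, n_boot, seed))
    se = statistics.stdev(reps)                # spread of the replicates
    bias = statistics.fmean(reps) - theta_hat  # mean replicate minus estimate
    lo_q = quantile(reps, alpha / 2)
    hi_q = quantile(reps, 1 - alpha / 2)
    # Pivotal (basic) interval: reflect the bootstrap quantiles around the
    # original estimate instead of using them directly.
    ci = (2 * theta_hat - hi_q, 2 * theta_hat - lo_q)
    return {"estimate": theta_hat, "se": se, "bias": bias, "ci": ci}


# Hypothetical data: 200 draws from an exponential distribution.
data_rng = random.Random(42)
sample = [data_rng.expovariate(1.0) for _ in range(200)]
summary = bootstrap_summary(sample, statistics.median)
print(summary)
```

A parametric bootstrap would differ only in `bootstrap_replicates`: instead of drawing from `data` with `rng.choices`, it would simulate new samples of the same size from the fitted model.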

*Note:* Thanks to Prof. Christopher Genovese for delivering this
lecture while I was enjoying the hospitality of the fen-folk.

*Reading*: Notes, chapter 6 (R for figures and examples; `pareto.R`; `wealth.dat`); lecture slides; R for in-class examples

Cox and Donnelly, chapter 8


Posted at January 31, 2013 10:30