February 04, 2011

Density Estimation (Advanced Data Analysis from an Elementary Point of View, Lecture 6)

The desirability of estimating not just conditional means, variances, etc., but whole distribution functions. Parametric maximum likelihood is one solution, provided the parametric model is right. Histograms and empirical cumulative distribution functions are non-parametric ways of estimating the distribution: do they work? The Glivenko-Cantelli law on the convergence of empirical distribution functions, a.k.a. "the fundamental theorem of statistics". More on histograms: they converge on the right density if the bins keep shrinking while the number of samples per bin keeps growing. Kernel density estimation and its properties; some error analysis. An example with data from the homework. Estimating conditional densities; another example with homework data. Some issues with likelihood, maximum likelihood, and non-parametric estimation.
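To make two of these ideas concrete, here is a minimal sketch, not the lecture's own code (that is in R). It uses only the Python standard library and simulated standard-normal data rather than the homework data. The function names (`ecdf`, `sup_distance`, `gaussian_kde`, `rule_of_thumb_bandwidth`) are hypothetical, and both the Gaussian kernel and the normal-reference bandwidth 1.06 · sd · n^(-1/5) are assumptions made for illustration; the lecture may choose its bandwidth differently. The first half measures the largest gap between the empirical CDF and the true CDF, which Glivenko-Cantelli says goes to zero; the second half builds a kernel density estimate.

```python
import bisect
import math
import random
import statistics


def ecdf(sample):
    """Return the empirical CDF: F_hat(x) = fraction of observations <= x."""
    xs = sorted(sample)
    n = len(xs)

    def F_hat(x):
        # bisect_right counts how many sorted observations are <= x
        return bisect.bisect_right(xs, x) / n

    return F_hat


def sup_distance(sample, cdf):
    """Largest gap between the empirical CDF and a continuous reference CDF.

    The ECDF is a step function, so the supremum is reached at a jump;
    checking just before and just after each jump is enough.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )


def std_normal_cdf(x):
    """CDF of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (assumed, not from the lecture)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return f_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h), with K the Gaussian density."""
    data = list(sample)
    scale = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return scale * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return f_hat


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    for n in (100, 1000, 10000):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        gap = sup_distance(sample, std_normal_cdf)
        print(f"n={n:6d}  sup |F_hat - F| = {gap:.4f}")

    sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    h = rule_of_thumb_bandwidth(sample)
    f_hat = gaussian_kde(sample, h)
    print(f"bandwidth = {h:.3f}, f_hat(0) = {f_hat(0.0):.3f}")
```

The printed gap should tend to shrink as n grows, and `f_hat(0)` should land near the true density value 1/sqrt(2π) ≈ 0.399, slightly below it on average because smoothing flattens the peak. Shrinking the bandwidth reduces that bias but makes the estimate noisier, which is the trade-off the error analysis is about.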

PDF

R

Advanced Data Analysis from an Elementary Point of View

Posted at February 04, 2011 01:35

Three-Toed Sloth