### Lecture: Additive Models (Advanced Data Analysis from an Elementary Point of View)

The "curse of dimensionality" limits the usefulness of fully non-parametric
regression in problems with many variables: bias remains under control, but
variance grows rapidly with dimension. Parametric models do not have this
problem, but have bias and do not let us discover anything about the true
function. Structured or constrained non-parametric regression compromises, by
adding some bias so as to reduce variance. Additive models are an example,
where each input variable has a "partial response function", which add together
to get the total regression function; the partial response functions are
unconstrained. This goes beyond linear models but still evades the curse of
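
In symbols: for a response $Y$ and predictors $x_1, \ldots, x_p$, the model is

$$ \mathbb{E}[Y \mid \vec{X} = \vec{x}] = \alpha + \sum_{j=1}^{p} f_j(x_j), $$

where each one-dimensional $f_j$ is estimated by smoothing rather than assumed
to have any particular functional form.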
Fitting additive models is done iteratively ("backfitting"): start with some
initial guess about each partial response function, then repeatedly smooth each
variable against the residuals left over from the others' current guesses, so
that the guesses correct one another until a self-consistent solution is
reached; a minimal sketch of this loop follows. (This is like automatically
finding an optimal transformation of each predictor before putting it into a
linear model.)
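
Here is a minimal sketch of the backfitting loop, assuming `smooth.spline()`
stands in for whatever one-dimensional smoother one prefers; the function name
`backfit` and its convergence settings are illustrative choices, not the
notes' code.

```r
# Minimal backfitting sketch (illustrative, not the notes' code).
# x: numeric matrix of predictors (n rows, p columns); y: numeric response.
# smooth.spline() stands in for any one-dimensional scatterplot smoother.
backfit <- function(x, y, tol = 1e-6, max_iter = 100) {
  n <- nrow(x)
  p <- ncol(x)
  alpha <- mean(y)          # intercept absorbs the overall mean
  f <- matrix(0, n, p)      # current guesses for the partial responses
  for (iter in 1:max_iter) {
    f_old <- f
    for (j in 1:p) {
      # Partial residuals: what's left after removing the intercept and
      # the other variables' current partial responses.
      r <- y - alpha - rowSums(f[, -j, drop = FALSE])
      fit_j <- smooth.spline(x[, j], r)
      f[, j] <- predict(fit_j, x[, j])$y
      f[, j] <- f[, j] - mean(f[, j])  # center each component for identifiability
    }
    # Stop when the guesses have stopped correcting each other.
    if (max(abs(f - f_old)) < tol) break
  }
  list(alpha = alpha, components = f, fitted = alpha + rowSums(f))
}
```

The fixed point of this loop, where no component changes when re-smoothed
against the others' residuals, is the self-consistent solution described above.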
Examples in R use the California house-price data; a short `gam()` sketch
appears below. Conclusion: there are no statistical reasons to prefer linear
models to additive models, hardly any scientific reasons, and increasingly few
computational ones; the continued thoughtless use of linear regression is a
scandal.
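
In practice one fits additive models with a packaged routine such as `gam()`
from the mgcv package. The data frame `calif` and its column names below are
hypothetical stand-ins for the California house-price data used in the
lecture, not the course's actual file.

```r
# Sketch of the lecture's kind of example with mgcv's gam(). The data
# frame `calif` and its column names are hypothetical stand-ins for the
# California house-price data; adjust to match the actual file.
library(mgcv)

# calif <- read.csv("calif.csv")   # hypothetical path, left commented out

fit <- gam(log(MedianHouseValue) ~ s(MedianIncome) + s(MedianHouseAge) +
             s(Latitude) + s(Longitude), data = calif)
summary(fit)            # effective degrees of freedom for each smooth term
plot(fit, pages = 1)    # one panel per estimated partial response function
```

(mgcv estimates the smooths by penalized regression splines rather than
literal backfitting, but the fitted object exposes the same additive
structure: one estimated partial response function per predictor.)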

*Reading*: Notes, chapter 9

*Optional reading*: Faraway, chapter 12

Posted at February 14, 2013 10:30