For the first time, I will be teaching a section of the course which is the prerequisite for my spring advanced data analysis class. This is an introduction to linear regression modeling for our third-year undergrads, and for others from related majors; my section currently has eighty students. Course materials, if you have some perverse desire to read them, will be posted on the class homepage twice a week.

This course is the first one in our undergraduate sequence where the
students have to bring together probability, statistical theory, and analysis
of actual data. I have mixed feelings about doing this through linear models.
On the one hand, my experience of applied problems is that there are really
very few situations where the "usual" linear model assumptions can be
maintained in good conscience. On the other hand, I suspect it *is*
usually easier to teach people the more general ideas if they've thoroughly
learned a concrete special case first; and, perhaps more importantly, whatever
the merits of
(e.g.) Box-Cox
transformations might actually be, it's the sort of thing people will
expect statistics majors to know...
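(For the curious: the idea behind Box-Cox is to pick a power transformation of the response that makes the usual normal-linear-model assumptions look least bad, with the power chosen by maximum likelihood. A minimal sketch, in Python with numpy rather than anything from the course materials, using a grid search over the profile log-likelihood:)

```python
# A minimal sketch of the Box-Cox transformation, with the power lambda
# chosen by profile likelihood over a grid; pure numpy, my own
# illustration rather than course code.
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform: (y^lam - 1)/lam, with the log limit at lam = 0."""
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def boxcox_loglik(y, lam):
    """Profile log-likelihood of lam under the normal model (up to a constant)."""
    z = boxcox(y, lam)
    n = len(y)
    return -0.5 * n * np.log(np.var(z)) + (lam - 1.0) * np.log(y).sum()

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=500)   # right-skewed, positive data

grid = np.linspace(-2, 2, 401)
lam_hat = grid[np.argmax([boxcox_loglik(y, lam) for lam in grid])]
print(f"estimated lambda: {lam_hat:.2f}")
```

Since the simulated data are lognormal, the estimated lambda should come out near zero, i.e., the procedure (correctly) recommends a log transform.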

*Addendum*, later that night: I should have made it clear in the
first place that my syllabus is, up through the second exam, ~~ripped
off~~ borrowed with gratitude
from Rebecca Nugent, who
has taught
401 outstandingly for many years.

*Update*, since people have asked for it, links here (see the course page for the source files for lectures):

- Introduction to the course
- Homework 1
- About statistical modeling
- Simple linear regression models
- Homework 2
- Estimation by least squares for simple linear regression models
- Estimation by maximum likelihood for simple linear regression models
- Homework 3
- Diagnostics and remedies for simple linear regression models
- Parametric inference for simple linear regression
- Homework 4
- Predictive inference for simple linear regression
- F tests, $R^2$, and other distractions. *Update*: What can I do but cackle?
- Interpreting simple linear models after transformations
- Theory exam 1
- Data analysis project 1
- Simple linear regression and linear algebra
- Multiple regression
- Homework 5 (data sets: `gpa.txt`, …)
- Diagnostics and inference for multiple regression
- Polynomial and categorical regression
- Homework 6 (`SENIC.txt` data set)
- Multicollinearity
- Testing and confidence sets for multiple coefficients
- Homework 7 (`water.txt` data file)
- Interactions
- Influential points and outliers
- Homework 8
- Model selection
- Midterm review
- Practice Exam 2
- Theory exam 2
- Data analysis project 2 (data file)
- Non-constant noise variance
- Correlated noise
- Variable selection
- Regression trees
- Homework 9
- Bootstrap
- Data analysis project 3
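(Most of the lectures above orbit around fitting a line by least squares, so here is the whole machinery of simple linear regression in a few lines of Python with numpy; the simulated data and closed-form estimators are my own sketch, not course code.)

```python
# Simple linear regression by ordinary least squares: the closed-form
# estimates slope = cov(x, y)/var(x) and intercept = ybar - slope * xbar,
# checked on simulated data with known coefficients. My own sketch.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 3.0 + 2.0 * x + rng.normal(0, 1.0, size=200)   # true intercept 3, slope 2

slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()
print(f"intercept estimate: {intercept:.2f}, slope estimate: {slope:.2f}")
```

With 200 points and unit noise, the estimates land close to the true values of 3 and 2, which is the basic sanity check the homeworks ask for in various guises.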

As post-mortems, some thoughts on the textbook and alternatives, and general lessons learned.

Posted at August 31, 2015 13:52 | permanent link