Attention conservation notice: Only relevant if you are a student at Carnegie Mellon University, or have a pathological fondness for reading lecture notes on statistics.

In the so-called spring, I will again be teaching 36-402 / 36-608, undergraduate advanced data analysis:

The goal of this class is to train you in using statistical models to analyze data: as data summaries, as predictive instruments, and as tools for scientific inference. We will build on the theory and applications of the linear model, introduced in 36-401, extending it to more general functional forms, and more general kinds of data, emphasizing the computation-intensive methods introduced since the 1980s. After taking the class, when you're faced with a new data-analysis problem, you should be able to (1) select appropriate methods, (2) use statistical software to implement them, (3) critically evaluate the resulting statistical models, and (4) communicate the results of your analyses to collaborators and to non-statisticians. During the class, you will do data analyses with existing software, and write your own simple programs to implement and extend key techniques. You will also have to write reports about your analyses.

Graduate students from other departments wishing to take this course should register for it under the number "36-608". Enrollment for 36-608 is very limited, and by permission of the professors only.

Prerequisites: 36-401, with a grade of C or better. Exceptions are only granted for graduate students in other departments taking 36-608.

This will be my fifth time teaching 402, and the fifth time where the
primary text is the draft
of Advanced Data
Analysis from an Elementary Point of View. (I hope my editor will
believe that I don't *intend* for my revisions to illustrate Zeno's
paradox.) It is the first time I will be co-teaching with the lovely and
talented Max
G'Sell.

*Unbecoming whining:* 402 will be larger this year than last, just
like it has been every year I've been here. This year, in fact, we'll
have over 150 students in it, or about 1/50 of
all CMU undergrads. (This has nothing to do with my teaching, and everything
to do with our student population.) I think it's great that we're
teaching what would be masters-level material at most schools to so many
juniors and seniors, but I don't think we'll be able to keep doubling every
five years without either having a lot of stuff break, or transforming
the nature of the course yet again. It's clearly a better problem to have than
"class sizes are halving every five years"*, but it's still a problem.

*: As I have said in a number of conversations over recent years, the nightmare scenario for statistics vs. "data science" is that statistics becomes a sort of mathematical analog to classics. People might pay lip-service to our value, especially people who are invested in pretending to intellectual rigor, but few would actually pay attention to anything we have to say.

