Attention conservation notice: > 600 words on how I'd teach this semester's course differently next time.

So: grades are done, and, a decent interval after submitting them, I got the (anonymized) student evaluations. (Five of the eighteen students bothered to fill them out.) This seems like a good time to take a look at how things went.

Overall, I'm pleased with the semester. Their grades were quite good,
and actual performance on the final exam was even better than I'd hoped —
several students who'd done poorly on the homework pulled off really good
exams, and nobody did much *worse* on the exam than on the homework.
Most importantly, judging by what people wrote for the final, lots of them
actually understood what I was trying to say. (Of course, I didn't give them a
version of the final exam at the start of the class, so maybe they all knew it
already.) I'm also reasonably satisfied with the choice of materials, and
definitely think that replacing the weekly lab sessions with an extra lecture
was the right thing to do.

Of course it wasn't all good. While linear algebra is not a pre-req for the class, I was still surprised at how unfamiliar many of the students were with it. The difficulty is that it is very, very hard to say anything about high-dimensional data without linear algebra. Some of them, of course, had no problem; perhaps I need a pre-test at the start, with catch-up reading for those without the background. (Making linear algebra an official pre-req doesn't seem like an option.)

The big issue, both from my point of view and according to three of the five
students who bothered to write evaluations, was the programming assignments.
These were *much* harder for them, especially for the bottom half of the
class, than I had anticipated. In fact they *kept* being harder than I
anticipated, so I really need to dial down the initial programming
expectations, *and* include more programming instruction. (See
previous post.) I am not sure what to cut to make room
for this; the best approach might be to integrate demos and code walk-throughs
with some lectures. Teaching them data-mining *without* getting their
hands dirty, however, seems like a travesty.

Student participation also needs work. Out of eighteen students, there were, to first order, three who spoke up in lecture. (To second order, maybe six.) This was not a problem with them; rather, I should have done more to encourage the others to talk. Likewise, only three students came to office hours.

Some more specific things to work on, in no particular order:

- The "waste, fraud and abuse" theme never really materialized. I can't see how the class was any the worse for this, though.
- A bunch of the hand-outs are still heavily based on Tom Minka's old notes; re-write. (There were actually complaints about the style clash!)
- The information-theory lectures were very confusing to the students. Re-work? Shorten? Drop?
- Systematically use the `np` package for nonparametric regression and density estimation (and, with discrete response variables, classification). Abandon `ksmooth`, `lowess`, etc., for anything except decorating plots.
- Cut neural networks?
- I definitely need to cover mixture models and density estimation.
- Additive models should probably be a lecture (or two?) of their own, not just a (much-kvetched-about) homework assignment.
- Consider specifically talking about successive-approximation algorithms and their convergence.
- Actually give the causal-inference lectures! (And see if someone from downstairs can't be persuaded to demo Tetrad. [Better yet, bug them some more about making it an R package.])
- Rigorously enforce the rule against just turning in printed-out R sessions in homework.
- Prepare a handout on "Things I expect you to pretend to remember from intro stats"? (What should be on it?)
- More real-data examples? On the one hand, they're convincing; on the other hand, *real* real data is big, messy, ambiguous, and not-infrequently expensive. But on the third hand, perhaps *that* is a lesson which should be driven home.
- Replace the newsgroups example with text which doesn't take so much back-story to explain. (Also, a larger corpus.)
**Update**, 14 Feb. 2009: K. suggests using Wikipedia, which is available for download. Probably still needs some pre-processing.
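For the switch to `np` mentioned above, the intended workflow is roughly as follows (a minimal sketch, assuming the CRAN `np` package's `npregbw`/`npreg` and `npudensbw`/`npudens` interfaces; `df`, `x`, and `y` are hypothetical names):

```r
library(np)  # kernel regression and density estimation

## Kernel regression: data-driven (cross-validated) bandwidth
## selection first, then the fit at that bandwidth.
bw  <- npregbw(y ~ x, data = df)
fit <- npreg(bws = bw)
plot(fit)   # regression curve at the selected bandwidth

## Unconditional density estimation works the same way.
dbw  <- npudensbw(~ x, data = df)
dens <- npudens(bws = dbw)
plot(dens)
```

The point of the two-step bandwidth-then-fit pattern is pedagogical as much as practical: it keeps the smoothing parameter visible to the students, rather than buried in a default as with `ksmooth` or `lowess`.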

**Update**, 16 March 2009: A nice sequence might be: PCA
(subtracting off successive principal components), to the
coordinate-descent/back-fitting approach to linear regression, to the
coordinate descent lasso, to additive models,
to SpAM. But this will need a lot
of linear algebra, and the middle steps are impractical.
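The unifying step in that sequence can be written down explicitly. For the coordinate-descent lasso, with predictors standardized so that $n^{-1}\sum_i x_{ij}^2 = 1$, the update for each coefficient is soft-thresholding of a partial-residual regression (this is the standard formula, not anything specific to the course):

$$\beta_j \leftarrow S\!\left(\frac{1}{n}\sum_{i=1}^{n} x_{ij}\, r_i^{(j)},\ \lambda\right), \qquad r_i^{(j)} = y_i - \sum_{k \neq j} x_{ik}\beta_k, \qquad S(z,\lambda) = \operatorname{sign}(z)\,\left(|z|-\lambda\right)_{+}$$

Replacing the soft-threshold with a smoother applied to the partial residuals gives back-fitting, which is the bridge from this update to additive models and thence to SpAM.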

Posted at December 28, 2008 10:55 | permanent link