This semester, I have been going to the statistics department's reading group on learning theory, even though it meets at the unholy hour of 9 am on Thursdays. (Naturally, I learned about it from Kris.) Since the end of September, we've been reading a single paper, Peter Bartlett, Michael Jordan (no, no, the important one) and Jon McAuliffe's "Convexity, Classification and Risk Bounds" [abstract, pdf]. Moved by some perverse impulse, I volunteered to present Theorem 3 in this paper, and did so on October 9th.
This was most instructive. My explanation of what was going on in this theorem and its proof, which occupy two pages of large type, took me three days to write, ran to five densely-printed RevTeX pages, and went on for a full hour and a half. (I am told that this kind of compression ratio is typical of papers written for the Annals of Statistics.) In the (highly unlikely) event that you, Dear Reader, would also like to understand that theorem, you can find my notes via the reading-group's web-page. (While there, you should really read Moulinath Banerjee's notes on covering numbers and VC dimension; these aren't involved in that theorem, but they're very important, and Mouli explains them well.)
I bring this up now because this week we had our final meeting on the paper, covering theorem 17, which is where they (at last!) get actual learning rates. We convinced ourselves that the proof was invalid, though the result may well be correct. (For want of a log, the bound was lost, etc.) This felt good, not because I like to see my betters stumble, but because I've finally become able to see what was happening in a proof like that, at least through the point where it went off the rails. I couldn't have done that a few years ago. It wasn't exactly a "your journey to the dark side is now complete" moment, but it certainly wasn't the kind of thing I was trained for...
We're not sure what to read next; suggestions?
Posted at November 20, 2004 17:03 | permanent link