August 14, 2019

Course Announcement: Data Mining (36-462/662), Fall 2019

For the first time in ten years, I find myself teaching data mining in the fall. This means I need to figure out what data mining is in 2019. Naturally, my first stab at a syllabus is based on what I thought data mining was in 2009. Perhaps it's changed too little; nonetheless, I'm feeling OK with it at the moment*. I am sure the thoughtful and constructive suggestions of the Internet will only reinforce this satisfaction.

--- Seriously, suggestions are welcome, except for suggesting that I teach about neural networks, which I deliberately omitted because I am an out-of-date stick-in-the-mud for reasons**.

*: Though I am not done selecting readings from the textbook, the recommended books, and sundry articles --- those will, however, come before the respective classes. I have been teaching long enough to realize that most students, particularly in a class like this, will read just enough of the most emphatically required material to think they know how to do the assignments. But there are exceptions, and anecdotally even some of that majority come back to the material later and benefit from pointers.

**: On the one hand, CMU (now) has plenty of well-attended classes on neural networks and deep learning, so what would one more add? On the other, my admittedly cranky opinion is that we have no idea why the new crop works better than the 1990s version, and it's not always clear that they do work better than good old-fashioned machine learning, so there.

Posted at August 14, 2019 17:17 | permanent link

August 06, 2019

Notes on "Intriguing Properties of Neural Networks", and two other papers (2014)

$\DeclareMathOperator*{\argmax}{argmax}$

Attention conservation notice: Slides full of bullet points are never good reading; why would you force yourself to read painfully obsolete slides (including even more painfully dated jokes) about a rapidly moving subject?

These are basically the slides I presented at CMU's Statistical Machine Learning Reading Group on 13 November 2014, on the first paper on what have come to be called "adversarial examples". It includes some notes I made after the group meeting on the Q-and-A, but I may not have properly credited (or understood) everyone's contributions even at the time. It also includes some even rougher notes about two relevant papers that came out the next month. Presented now, while I procrastinate on preparing my fall class, in the interest of the historical record.

Paper I: "Intriguing properties of neural networks" (Szegedy et al.)

Background

• Nostalgia for the early 1990s: G. Hinton and company are poised to take over the world, NIPS is mad for neural networks, Clinton is running for President...
• Learning about neural networks for the first time in cog. sci. 1
• Apocrypha: a neural network supposed to distinguish tanks from trucks in aerial photographs actually learned about parking lots...
• The models
• Multilayer perceptron $\phi(x) = (\phi_K \circ \phi_{K-1} \circ \cdots \circ \phi_1)(x)$
• This paper not concerned with training protocol, just following what others have done
• The applications
• MNIST digit-recognition
• ImageNet
• $10^7$ images from YouTube
• So we've got autoencoders, we've got convolutional networks, we've got your favorite architecture and way of training it
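To fix notation, the layer-by-layer composition $\phi = \phi_K \circ \cdots \circ \phi_1$ can be sketched in a few lines of numpy. The layer sizes and the ReLU nonlinearity here are illustrative assumptions on my part, not details from the paper:

```python
# Minimal sketch of a multilayer perceptron as a composition of layers,
# phi = phi_K o ... o phi_1. Shapes and the ReLU nonlinearity are
# illustrative choices, not taken from Szegedy et al.
import numpy as np

rng = np.random.default_rng(0)

def make_layer(n_in, n_out):
    # Each phi_k is an affine map followed by an elementwise nonlinearity
    W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))
    b = np.zeros(n_out)
    return lambda x: np.maximum(W @ x + b, 0.0)

# phi_1, ..., phi_K for a toy 784 -> 128 -> 64 -> 10 network
layers = [make_layer(784, 128), make_layer(128, 64), make_layer(64, 10)]

def phi(x):
    for layer in layers:  # apply phi_1 first, phi_K last
        x = layer(x)
    return x

x = rng.normal(size=784)
print(phi(x).shape)  # shape of the final-layer representation
```

The point of writing it this way is just that $\phi(x)$, the last-layer representation, is what the "semantics" arguments below are about.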

Where Are the Semantics?

• Claim (the literature, passim): individual hidden units in the network encode high-level semantic features
• Support for the claim: look at the images which maximize the activation of units in some layer $\mathcal{X}_i = \argmax_{x}{\langle \phi(x), e_i \rangle}$ then do story-telling about what $x \in \mathcal{X}_i$ have in common
• The critique: pick a random unit vector $v$ and do similar story-telling about $\mathcal{X}_{v} = \argmax_{x}{\langle \phi(x), v \rangle}$
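The claim-vs-critique comparison is easy to operationalize: collect the images whose representations $\phi(x)$ have the largest inner product with a basis vector $e_i$, and likewise with a random unit vector $v$. In this sketch the "representations" are random stand-ins and the top-$k$ size is arbitrary; only the selection procedure matches the text:

```python
# Sketch of selecting top-activating images for a coordinate direction
# (the claim) versus a random unit direction (the critique). The
# "representations" here are random stand-ins for phi(x) over a dataset.
import numpy as np

rng = np.random.default_rng(1)
n_images, d = 1000, 50
reps = rng.normal(size=(n_images, d))    # stand-in for phi(x), one row per image

def top_activating(direction, k=8):
    scores = reps @ direction            # <phi(x), direction> for each image
    return np.argsort(scores)[-k:][::-1] # indices of the k largest, descending

e_i = np.zeros(d)
e_i[3] = 1.0                             # a coordinate direction ("unit 3")
v = rng.normal(size=d)
v /= np.linalg.norm(v)                   # a random unit direction

X_i = top_activating(e_i)                # images for the claim's story-telling
X_v = top_activating(v)                  # images for the critique's story-telling
```

The critique is that $\mathcal{X}_v$ invites just as much after-the-fact story-telling as $\mathcal{X}_i$ does.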

• Real units
• Comment on the top left: white flowers?!?

• Randomized pseudo-units

• My assessment of the critique:
• Weakness of both claim and critique: people are very good at finding semantic features that link random objects (e.g., Zhu, Rogers and Gibson, "Human Rademacher Complexity", NIPS 2009, where subjects read semantics like "related to motel service" into random word lists)
• How much semantics would people be able to read into a random collection of training $x$'s of equal size to $\mathcal{X}_i$ or $\mathcal{X}_v$?
• Implications of the critique: This is if anything a vindication for good old-fashioned parallel distributed processing (as in the 1990s...)
• Doesn't matter as engineering...
• Also doesn't matter as a caricature of animal nervous systems: just as there are no grandmother cells in the brain, there is no "white flower" cell in the network, that's actually distributed across the network
• Suggestions from the audience:
• Ryan Tibshirani: maybe the basis vectors are more "prototypical" than the random vectors? Referenced papers by Nina Balcan and by Robert Tibshirani on prototype clustering
• Yu-Xing Wang: What about images corresponding to weights of hidden-layer neurons in convolutional networks? Me: I need to see what's going on in that paper before I can comment...

The Learned Classifier Isn't Perceptually Continuous

• Claim: generalization based on semantics, or at least on features not local in the input space
• Find the smallest perturbation $r$ we can apply to a given $x$ to drive it to the desired class $l$, i.e., smallest $r$ s.t. $f(x+r) = l$
• Robustness: use the same perturbation on a different network (different hyperparameters, different training set) and see whether $f^{\prime}(x+r) = l$ as well
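The perturbation search can be sketched as the paper's relaxation: instead of exactly minimizing $\|r\|$ subject to $f(x+r)=l$, do gradient descent on $c\|r\|^2 + \mathrm{loss}(x+r, l)$. To keep the sketch self-contained I use a toy linear softmax classifier (so the gradient is exact); the classifier, the penalty weight $c$, and the step size are all made-up choices, not the paper's box-constrained L-BFGS setup:

```python
# Toy sketch of the targeted-perturbation search: gradient descent on
# c*||r||^2 + cross-entropy(x + r, target), for a fixed linear softmax
# classifier. All constants here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
d, n_classes = 20, 3
W = rng.normal(size=(n_classes, d))  # a fixed "trained" linear classifier

def f(x):
    return int(np.argmax(W @ x))     # the hard classification f(x)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

x = rng.normal(size=d)
target = (f(x) + 1) % n_classes      # some class l other than f(x)

r = np.zeros(d)
c, lr = 0.01, 0.05
for _ in range(1000):
    p = softmax(W @ (x + r))
    # gradient of the cross-entropy toward `target`, plus the size penalty
    grad = W.T @ (p - np.eye(n_classes)[target]) + 2 * c * r
    r -= lr * grad

print(f(x), f(x + r))  # the perturbed input should now be classified as `target`
```

The robustness check in the last bullet would then re-use this same $r$ on a second classifier $f^{\prime}$ (different weights, different training set) and ask whether $f^{\prime}(x+r) = l$ still holds.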