Course Announcement: 36-350, Data Mining, Fall 2009
Since the semester starts in a lamentably small number of days:
- Title: 36-350, Statistical Data Mining
- Prereqs: One of 36-226, 36-310, 36-625, or consent of instructor. In addition, familiarity with vectors and matrices, and comfort with programming, will be very helpful.
- Lectures: MWF 10:30--11:20, Porter Hall 226B. (The on-line class schedule thinks the Friday lecture is a lab; it's wrong.)
- Course description:
- Data mining is the art of extracting useful patterns from large bodies of
data: finding seams of actionable knowledge in the raw ore of information. The
rapid growth of computerized data, and the computer power available to analyze
it, creates great opportunities for data mining in business, medicine, science,
government, etc. The aim of this course is to help you take advantage of these
opportunities in a responsible way. After taking the class, when you're faced
with a new problem, you should be able to (1) select appropriate methods, and
justify their choice, (2) use and program statistical software (i.e., R) to
implement them, and (3) critically evaluate the results and communicate them to
colleagues in business, science, etc.
- Data mining is related to statistics and to machine learning, but has its
own aims and scope. Statistics is a mathematical science, studying how reliable
inferences can be drawn from imperfect data. Machine learning is a branch of
engineering, developing a technology of automated induction. We will freely use
tools from statistics and from machine learning, but we will use them as tools,
not things to study in their own right. We will do a lot of calculations, but
will not prove many theorems, and we will do even more experiments than
calculations.
The current topic outline, the grading policy, etc., can all be
found on the class webpage.
The course will mostly be very similar to the 2008 iteration, since
that seemed to work, with some modifications in
light of that experience. Podcast lectures are probably not going
to happen, owing to technical incompetence on my part.
(Oh, and in case you're wondering: I'm behind on answering everyone else's
email too, not just yours.)
Corrupting the Young;
Enigmas of Chance;
Self-Centered
Posted at August 17, 2009 14:17