Course Announcement: 36-350, Statistical Computing
Since the semester begins on Monday, I might as well admit to myself that I
am, in fact, teaching a new class:
- 36-350, Statistical Computing
- Instructors: Cosma Shalizi and Vincent Vu
- Description: Computational data analysis is an essential part of modern statistics.
Competent statisticians must be able not just to run existing programs, but to
understand the principles on which they work. They must also be able to read,
modify and write code, so that they can assemble the computational tools needed
to solve their data-analysis problems, rather than distorting problems to fit
tools provided by others. This class is an introduction to programming,
targeted at statistics majors with minimal programming knowledge, which will
give them the skills to grasp how statistical software works, tweak it to suit
their needs, recombine existing pieces of code, and when needed create their
own programs.
- Students will learn the core ideas of programming (functions,
objects, data structures, flow control, input and output, debugging, logical
design and abstraction) through writing code to assist in numerical and
graphical statistical analyses. Students will in particular learn how to write
maintainable code, and to test code for correctness. They will then learn how
to set up stochastic simulations, how to parallelize data analyses, how to
employ numerical optimization algorithms and diagnose their limitations, and
how to work with and filter large data sets. Since code is also an important
form of communication among scientists, students will learn how to comment and
organize code.
- The class will be taught in the R language.
- Pre-requisites: This is an introduction to programming for
statistics students. Prior exposure to statistical thinking, to data analysis,
and to basic probability concepts is essential, as is some prior acquaintance
with statistical software. Previous programming experience is not
assumed, but familiarity with the computing system is. Formally, the
pre-requisites are "Computing at Carnegie Mellon" (or consent of instructor),
plus one of 36-202 or 36-208, with 36-225 as either a pre-requisite
(preferable) or co-requisite (if need be).
Further details, subject to change, at the class website.
Teaching materials will definitely be posted there, and may be posted here.
(For tedious reasons, this class has the same number as the data-mining class
I've taught previously; that course is now numbered 36-462, and will be taught
in the spring by somebody else, while I'll be returning to 36-402, advanced
data analysis.)
Corrupting the Young;
Enigmas of Chance;
Introduction to Statistical Computing
Posted at August 24, 2011 23:58