Class Announcement: 36-350, Statistical Computing, Fall 2014
Fourth time is the charm:
- 36-350, Statistical Computing
- Instructors: Yours truly and Andrew Thomas
- Description: Computational data analysis is an essential part of
modern statistics. Competent statisticians must be able not just to run
existing programs, but also to understand the principles on which they work. They
must also be able to read, modify and write code, so that they can assemble the
computational tools needed to solve their data-analysis problems, rather than
distorting problems to fit tools provided by others. This class is an
introduction to programming, targeted at statistics majors with minimal
programming knowledge, which will give them the skills to grasp how statistical
software works, tweak it to suit their needs, recombine existing pieces of
code, and when needed create their own programs.
- Students will learn the core ideas of programming (functions,
objects, data structures, flow control, input and output, debugging, logical
design and abstraction) through writing code to assist in numerical and
graphical statistical analyses. Students will in particular learn how to write
maintainable code, and to test code for correctness. They will then learn how
to set up stochastic simulations, how to parallelize data analyses, how to
employ numerical optimization algorithms and diagnose their limitations, and
how to work with and filter large data sets. Since code is also an important
form of communication among scientists, students will learn how to comment and
organize code.
- The class will be taught in the R language, using RStudio for labs
and R Markdown for assignments.
- Pre-requisites: This is an introduction to programming for
statistics students. Prior exposure to statistical thinking, to data analysis,
and to basic probability concepts is essential, as is some prior acquaintance
with statistical software. Previous programming experience is not
assumed, but familiarity with the computing system is. Formally, the
pre-requisites are "Computing at Carnegie Mellon" (or consent of instructor),
plus one of either 36-202 or 36-208, with 36-225 as either a pre-requisite
(preferable) or co-requisite (if need be).
- The class may be unbearably redundant for those who already know a
lot about programming. The class will be utterly incomprehensible for
those who do not know statistics and probability.
Further details can be found at
the class website.
Teaching materials (lecture slides, homeworks, labs, etc.) will appear both
there and here.
The class is much bigger than in any previous year: we currently
have 50 students enrolled in two back-to-back lecture sections, and another
twenty-odd on the waiting list, pending more space for labs. Most of the ideas
tossed out in my last self-evaluation are going to be
at least tried; I'm particularly excited about pair programming for the labs.
Also, I at least am enjoying re-writing the lectures in R Markdown's
presentation mode.
Manual trackback: Equitablog
Corrupting the Young;
Enigmas of Chance;
Introduction to Statistical Computing
Posted at August 25, 2014 10:30