Talks Next Week
Attention conservation notice: Only of interest if you (1)
like hearing people talk about statistics and machine learning, and (2) will be
in Pittsburgh next week.
I have been remiss about advertising upcoming talks.
- Mark Davenport, "To
Adapt or Not To Adapt: The Power and Limits of Adaptivity for Sparse
Estimation"
- Abstract: In recent years, the fields of signal processing,
statistical inference, and machine learning have come under mounting pressure
to accommodate massive amounts of increasingly high-dimensional data. Despite
extraordinary advances in computational power, the data produced in application
areas such as imaging, remote surveillance, meteorology, genomics, and
large-scale network analysis continues to pose a number of challenges. Fortunately,
in many cases these high-dimensional signals contain relatively little
information compared to their ambient dimensionality. For example, signals can
often be well-approximated as sparse in a known basis, as a matrix having low
rank, or using a low-dimensional manifold or parametric model. Exploiting this
structure is critical to any effort to extract information from such data.
- In this talk I will overview some of my recent research on how to exploit
such models to recover high-dimensional signals from as few observations as
possible. Specifically, I will primarily focus on the problem of estimating a
sparse vector from a small number of noisy measurements. To begin, I will
consider the case where the measurements are acquired in a nonadaptive
fashion. I will establish a lower bound on the minimax mean-squared error of
the recovered vector which very nearly matches the performance of
$\ell_1$-minimization techniques, and hence shows that these techniques are
essentially optimal. I will then consider the case where the measurements are
acquired sequentially in an adaptive manner. I will prove a lower bound that
shows that, surprisingly, adaptivity does not allow for substantial improvement
over standard nonadaptive techniques in terms of the minimax MSE. Nonetheless,
I will also show that there are important regimes where the benefits of
adaptivity are clear and overwhelming.
- Time and place: 4--5 pm on Monday, 20 February 2012, in Scaife
Hall 125
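(For readers who haven't seen $\ell_1$ recovery in action, here is a minimal sketch of the kind of technique the abstract refers to: iterative soft-thresholding (ISTA) for the lasso, on a made-up compressed-sensing problem with a Gaussian measurement matrix. The problem sizes, penalty, and noise level are all invented for illustration; this is not the speaker's code.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: recover a k-sparse vector from m < n noisy linear measurements.
n, m, k = 200, 80, 5
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.normal(0, 1, size=k) + 2 * np.sign(rng.normal(size=k))

A = rng.normal(0, 1 / np.sqrt(m), size=(m, n))   # Gaussian measurement matrix
y = A @ x_true + 0.01 * rng.normal(size=m)       # noisy measurements

def ista(A, y, lam=0.02, n_iter=500):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2    # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L          # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

x_hat = ista(A, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Here the measurements are nonadaptive: the rows of `A` are fixed before any data is seen, which is exactly the regime Davenport's minimax lower bound addresses.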
- Ambuj Tewari, "From
Probabilistic to Game Theoretic Foundations for Learning and Prediction"
- Abstract: The probabilistic approach to prediction problems
assumes that the data is generated from an underlying stochastic process. A
reasonable goal then is to minimize the expected loss, or risk. The game
theoretic approach, in contrast, views prediction as a repeated game between
the learner and an adversary. The learner's goal then is to do well no matter
what strategy is followed by the adversary. Minimizing regret is one of the
well known ways to operationalize the notion of doing well. With a long history
in varied disciplines such as Computer Science, Economics, Information Theory,
and Statistics, the game theoretic approach has witnessed a vigorous
development. Yet the suite of standard tools available for the probabilistic
setting, such as Rademacher & Gaussian averages, covering numbers, and
combinatorial dimensions, was missing in the game theoretic setting. In this
talk, I will show how it is indeed possible to develop analogues of these tools
for the game theoretic setting. Unlike the probabilistic setting, where
empirical risk minimization is a canonical algorithm, we will not be able to
exhibit a corresponding canonical algorithm for the game theoretic
setting. However, under the additional assumption of convexity, I will show
that Mirror Descent, a classic algorithm from optimization theory, is a
canonical algorithm achieving minimax regret rates.
- (Talk is based on papers written jointly with Alexander Rakhlin, Nathan
Srebro, and Karthik Sridharan.)
- Time and place: 10--11 am on Wednesday, 22 February 2012, in Gates
Hall 6115
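(A concrete instance of the claim in the abstract: Mirror Descent with the entropic regularizer on the probability simplex is the classic exponential-weights update, and it guarantees regret at most $\sqrt{(T/2)\ln N}$ against $N$ experts with losses in $[0,1]$. The toy losses below are invented for illustration; the regret bound itself is a standard theorem, not something specific to this simulation.)

```python
import numpy as np

rng = np.random.default_rng(1)

# Prediction with expert advice: N experts, T rounds, losses in [0, 1].
N, T = 10, 2000
losses = rng.uniform(0, 1, size=(T, N))
losses[:, 3] -= 0.3                  # make expert 3 noticeably better
losses = np.clip(losses, 0.0, 1.0)

eta = np.sqrt(8 * np.log(N) / T)     # tuned learning rate
w = np.ones(N) / N                   # uniform prior over experts
learner_loss = 0.0
cum = np.zeros(N)
for t in range(T):
    learner_loss += w @ losses[t]        # learner plays w before seeing losses[t]
    cum += losses[t]
    w = w * np.exp(-eta * losses[t])     # entropic mirror-descent step
    w /= w.sum()                         # normalize back onto the simplex

regret = learner_loss - cum.min()        # vs. best single expert in hindsight
bound = np.sqrt(T * np.log(N) / 2)
```

No stochastic assumption on the losses is needed anywhere: the regret guarantee holds for every individual loss sequence, which is the whole point of the game-theoretic viewpoint.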
- Forrest W. Crawford, "Birth,
Death, Sex, Lies: Markov Counting Processes in Genetics and Beyond"
- Abstract: A general birth-death process (BDP) is a continuous-time
Markov chain that counts the number of particles in a system over time. At any
moment in time, a particle may give birth or die, and the rate at which these
events occur depends on the number of particles in the system at that time.
While widely used in population biology, genetics, and evolution, statistical
inference techniques for general BDPs remain elusive. In fact, the likelihood
of a discrete observation from many of these processes cannot be written in
closed form. In this talk, I outline several fundamental results that allow
computation of transition probabilities and maximum likelihood estimates for
general BDPs. I apply these novel methods to three important applied problems.
First, I describe a technique for determining the effect of antibody treatment
on the growth of lymphoma cells in vitro. Second, I investigate the evolution
of DNA microsatellites in humans and chimpanzees using a log-linear model for
the rates of repeat duplication and deletion. Finally, I use a BDP to infer
true counts of sex acts from rounded self-reported counts in a longitudinal
study of risky behaviors in young people living with HIV. These applications
illustrate the mathematical, statistical, and computational challenges involved
in learning from BDPs in biology, medicine, and public health.
- Time and place: 4--5 pm on Wednesday, 22 February 2012, in Scaife Hall 125
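(To fix ideas about the model class: a general BDP jumps from state $n$ to $n+1$ at rate $\lambda_n$ and to $n-1$ at rate $\mu_n$. Simulating one is easy via the Gillespie algorithm, even though, as the abstract says, the transition probabilities are the hard part. The linear rates below are an arbitrary illustrative choice, not taken from the talk.)

```python
import random

def simulate_bdp(n0, t_max, birth=lambda n: 0.5 * n, death=lambda n: 0.3 * n,
                 seed=0):
    """Gillespie simulation of a birth-death process: from state n, a birth
    occurs at rate birth(n) and a death at rate death(n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    path = [(t, n)]
    while t < t_max and n > 0:
        total = birth(n) + death(n)
        if total == 0:               # absorbing state
            break
        t += rng.expovariate(total)  # exponential holding time
        if t > t_max:
            break
        n += 1 if rng.random() < birth(n) / total else -1
        path.append((t, n))
    return path

path = simulate_bdp(n0=10, t_max=5.0)
```

Statistical inference runs the other way: given the process only at a few discrete observation times, one needs the transition probabilities over those gaps, which is where the closed-form difficulties mentioned in the abstract arise.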
- Ron Bekkerman, "Scaling Up
Machine Learning"
- Abstract: In this talk, I'll provide an extensive introduction to parallel and distributed machine learning. I'll answer the questions "How big is big data, actually?", "How much training data is enough?", "What do we do if we don't have enough training data?", "What are the platform choices for parallel learning?", etc. Using the example of k-means clustering, I'll discuss the pros and cons of machine learning in Apache Pig, MPI, DryadLINQ, and CUDA. Time permitting, I'll take a dive into a very large scale text categorization task.
- Time and place: 1:30--2:30 pm on Thursday, 23 February 2012, in Newell-Simon Hall 1305
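(The reason k-means is everyone's favorite parallelization example: each Lloyd iteration is a map-reduce. The "map" assigns a chunk of points to centers and emits per-cluster partial sums and counts; the "reduce" merges those statistics and recomputes centers. A minimal sequential sketch of that decomposition, on invented two-cluster data; any of the platforms above just distributes the map step.)

```python
import numpy as np

rng = np.random.default_rng(2)

# Two well-separated clusters in the plane.
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
centers = np.array([[1.0, 1.0], [4.0, 4.0]])

def map_step(chunk, centers):
    """Per-chunk work: assign points to the nearest center, return
    per-cluster partial sums and counts (the only statistics needed)."""
    d = np.linalg.norm(chunk[:, None, :] - centers[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    k = len(centers)
    sums = np.array([chunk[assign == j].sum(axis=0) for j in range(k)])
    counts = np.array([(assign == j).sum() for j in range(k)])
    return sums, counts

for _ in range(10):  # Lloyd iterations
    partials = [map_step(chunk, centers) for chunk in np.array_split(X, 4)]
    sums = sum(p[0] for p in partials)        # reduce: merge partial sums
    counts = sum(p[1] for p in partials)      # ...and partial counts
    centers = sums / counts[:, None]          # new centers from merged stats
```

Because the per-chunk messages are just $k$ sums and counts (not the raw points), communication cost is independent of the data size, which is what makes the algorithm scale.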
As always, the talks are free and open to the public.
(You see why I have trouble keeping up with these.)
Enigmas of Chance
Posted at February 19, 2012 12:30 | permanent link