Next Week at the Statistics Seminar; Week After Next at the Machine Learning Seminar
There's not much connection between the talks, other than that they should both be great, and I don't feel like writing two posts.
- Ronald Coifman,
"Analytic Organization of Observational Databases as a Tool for Learning and
Inference"
- Abstract: We describe a mathematical framework to learn and
organize databases without incorporation of expert information. The database
could be a matrix of a linear transformation for which the goal is to
reorganize the matrix so as to achieve compression and fast algorithms. Or the
database could be a collection of documents and their vocabulary, an array of
sensor measurements such as EEG, or a financial time series or segments of
recorded music. If we view the database as a questionnaire, we organize the
population into a contextual demographic diffusion geometry and the questions
into a conceptual geometry; this is an iterative process in which each
organization informs the other, with the goal of entropy reduction of the whole
database.
- This organization, being totally data-agnostic, applies to the other examples,
thereby automatically generating a data-driven conceptual/contextual pairing.
We will describe the basic underlying tools from Harmonic Analysis for
measuring success in extracting structure, tools which enable functional
regression prediction and basically signal processing methodologies.
- Time and Place: 4:30--5:30 pm on Monday, 26 September 2011 in Baker Hall, Giant Eagle Auditorium (A51)
- Alex Smola, "Scaling Machine Learning to the Internet"
- Abstract: In this talk I will give an overview of an array of
highly scalable techniques for both observed and latent variable models. This
makes them well suited for problems such as classification, recommendation
systems, topic modeling and user profiling. I will present algorithms for
batch and online distributed convex optimization to deal with large amounts of
data, and hashing to address the issue of parameter storage for personalization
and collaborative filtering. Furthermore, to deal with latent variable models
I will discuss distributed sampling algorithms capable of dealing with tens of
billions of latent variables on a cluster of 1000 machines.
- The algorithms described are used for personalization, spam filtering,
recommendation, document analysis, and advertising.
- Time and Place: 3--4 pm on Thursday, 6 October 2011 in Gates Hall 8102
As always, both talks are free and open to the public.
Enigmas of Chance
Posted at September 24, 2011 16:06