November 02, 2011

First and Second City Sloth

I mentioned trips for upcoming talks, didn't I?

"When Can We Learn Exponential Random Graph Models from Samples?"
Abstract: Typically, statistical models of network structure are models for the entire network, while the data is only a sampled sub-network. Parameters for the whole network, which are what we care about, are estimated by fitting the model on the sub-network. This assumes that the model is "consistent under sampling", or, in terms of the theory of stochastic processes, that it forms a projective family. For the deservedly-celebrated class of exponential random graph models (ERGMs), this apparently trivial condition is in fact violated by many popular and scientifically appealing models; satisfying it drastically limits ERGMs' expressive power. These results are special cases of more general ones about exponential families of dependent variables, which we also prove. As a consolation prize, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses. (Joint work with Alessandro Rinaldo.)
Time and place: 4--5 pm on Thursday, 3 November 2011, in the "Interschool Lab", room 750 in the Schapiro Center for Engineering and Physical Science Research, Columbia University
"Nonparametric Bounds on Time Series Prediction Risk for Model Evaluation and Selection", University of Chicago Econometrics and Statistics Seminar
Abstract: Everyone wants their time series model to predict well. Since how well it did on the data you used to fit it exaggerates how well it can be expected to do in the future, and since penalties like AIC are only correct asymptotically (if then), controlling prediction risk with finite data needs something different. Combining tools from machine learning and ergodic theory lets us build a bound on prediction risk for state-space models in terms of historical performance, a measure of the model's capacity to fit arbitrary data, and a measure of how much information is actually in the time series. The result applies even at small samples, places minimal restrictions on the data source, and is agnostic about mis-specification. These bounds can then be used to evaluate and compare models. (Joint work with Daniel McDonald and Mark Schervish.)
Time and place: 1:30--2:50 pm on Thursday, 17 November 2011, in "HC 3B" (I don't know where that is but hopefully I will by then)

The Columbia talk is free and open to the public. I will be disillusioned unless Chicago not only charges admission, but uses a carefully optimized scheme of price discrimination.


Posted at November 02, 2011 10:30

Three-Toed Sloth