First and Second City Sloth
I mentioned trips for upcoming talks, didn't I?
- "When Can We Learn Exponential Random Graph Models from Samples?"
- Abstract: Typically, statistical models of network structure are models for the entire network, while the data is only a sampled sub-network. Parameters for the whole network, which are what we care about, are estimated by fitting the model on the sub-network. This assumes that the model is "consistent under sampling", or, in terms of the theory of stochastic processes, that it forms a projective family. For the deservedly-celebrated class of exponential random graph models (ERGMs), this apparently trivial condition is in fact violated by many popular and scientifically appealing models; satisfying it drastically limits ERGM's expressive power. These results are special cases of more general ones about exponential families of dependent variables, which we also prove. As a consolation prize, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses. (Joint work with Alessandro Rinaldo.)
- Time and place: 4--5 pm on Thursday, 3 November 2011, in the
"Interschool Lab", room 750 in the Schapiro Center for Engineering and Physical
Science Research, Columbia University
- "Nonparametric Bounds on Time Series Prediction Risk for Model Evaluation and Selection", University of Chicago Econometrics and Statistics Seminar
- Abstract: Everyone wants their time series model to predict well.
Since how well it did on the data you used to fit it exaggerates how well it
can be expected to do in the future, and since penalties like AIC are only
correct asymptotically (if then), controlling prediction risk with finite data
needs something different. Combining tools from machine learning and ergodic
theory lets us build a bound on prediction risk for state-space models in terms
of historical performance, a measure of the model's capacity to fit arbitrary
data, and a measure of how much information is actually in the time series.
The result applies even at small samples, places minimal restrictions on the
data source, and is agnostic about mis-specification. These bounds can then be
used to evaluate and compare models. (Joint work with Daniel McDonald and Mark
- Time and place: 1:30--2:50 pm on Thursday, 17 November 2011, in "HC 3B" (I don't know where that is, but hopefully I will by then)
The Columbia talk is free and open to the public. I will be disillusioned
unless Chicago not only charges admission, but uses a carefully optimized
scheme of price discrimination.
Posted at November 02, 2011 10:30