## Indirect Inference

*27 Feb 2017 16:30*

A technique of parameter estimation for simulation models. You go and build a stochastic generative model of your favorite process or assemblage, and, being a careful scientist, you do a conscientious job of trying to include what you guess are all the most important mechanisms. The result is something you can step through to produce a simulation of the process of interest. But your model contains some unknown parameters, let's say generically \( \theta \), and you would like to tune those to match the data — or see if, despite your best efforts, there are aspects of the data which your model just can't match.

Very often, you will find that your model is too complicated for you to
appeal to any of the usual estimation methods
of statistics. Because you've been aiming for
scientific adequacy rather than statistical tractability, it will often happen
that there is no way to even calculate the likelihood of a given data
set \( x_1, x_2, \ldots x_t \equiv x_1^t \) under
parameters \( \theta \) in closed form, which would rule out even
numerical likelihood maximization, to say nothing of Bayesian methods, should
you be into them. (For concreteness, I am writing as though the data were just
a time series, possibly vector-valued, but the
ideas adapt in the obvious way to spatial processes or more complicated
formats.) Yet you *can* simulate; it seems like there should be some
way of saying whether the simulations look like the data.

This is where indirect inference comes in, with what I think is a really
brilliant idea. Introduce a *new* model, called the "auxiliary model",
which is mis-specified and typically not even generative, but
is *easily* fit to the data, and to the data alone. (By that last I
mean that you don't have to impute values for latent variables, etc., etc.,
even though you might know those variables exist and are causally important.)
The auxiliary model has its own parameter vector \( \beta \), with an
estimator \( \hat{\beta} \). These parameters describe aspects of the
distribution of observables, and the idea of indirect inference is that we can
estimate the generative parameters \( \theta \) by trying to
match *those* aspects of observations, by trying to match
the *auxiliary* parameters.

On the one side, start with the data \( x_1^t \) and get auxiliary parameter estimates \( \hat{\beta}(x_1^t) \equiv \hat{\beta}_t \). On the other side, for each \( \theta \) we can generate a simulated realization \( \tilde{X}_1^t(\theta) \) of the same size (and shape, if applicable) as the data, leading to auxiliary estimates \( \hat{\beta}(\tilde{X}_1^t(\theta)) \equiv \tilde{\beta}_t(\theta) \). The indirect inference estimate \( \hat{\theta} \) is the value of \( \theta \) where \( \tilde{\beta}_t(\theta) \) comes closest to \( \hat{\beta}_t \). More generally, we can introduce a (symmetric, positive-definite) matrix \( \mathbf{W} \) and minimize the quadratic form \[ \left(\hat{\beta}_t - \tilde{\beta}_t(\theta)\right) \cdot \mathbf{W} \left(\hat{\beta}_t - \tilde{\beta}_t(\theta)\right) \] with the entries in the matrix chosen to give more or less relative weight to the different auxiliary parameters.
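The procedure just described can be sketched in a few lines. This is only a minimal illustration under assumptions which are not in the text above: the generative model is taken to be a first-order moving average, \( X_t = \epsilon_t + \theta \epsilon_{t-1} \), the auxiliary model is a first-order autoregression fit by least squares, and the minimization is a crude grid search. The function names (`simulate_ma1`, `fit_ar1`, `indirect_inference`) are hypothetical. Since there is only one auxiliary parameter here, the weight matrix \( \mathbf{W} \) reduces to a positive scalar and drops out of the minimization.

```python
import random

def simulate_ma1(theta, n, seed):
    """Simulate n steps of x_t = e_t + theta * e_{t-1} with standard Gaussian noise."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [e[t + 1] + theta * e[t] for t in range(n)]

def fit_ar1(x):
    """Auxiliary estimator: least-squares slope of x_t on x_{t-1}, no intercept."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def indirect_inference(data, grid, sim_seed=12345):
    """Return the theta on the grid whose simulated auxiliary estimate
    comes closest to the auxiliary estimate from the data."""
    beta_hat = fit_ar1(data)
    n = len(data)

    def distance(theta):
        # Re-using the same seed for every theta (common random numbers)
        # keeps the objective from jumping around with simulation noise.
        beta_tilde = fit_ar1(simulate_ma1(theta, n, sim_seed))
        return (beta_hat - beta_tilde) ** 2

    return min(grid, key=distance)

if __name__ == "__main__":
    # Stand-in "data": a simulation at a known parameter value, theta_0 = 0.5.
    data = simulate_ma1(0.5, 5000, seed=1)
    grid = [i / 50 for i in range(-45, 46)]  # theta from -0.9 to 0.9
    print("indirect inference estimate:", indirect_inference(data, grid))
```

In a real application the grid search would give way to a proper optimizer, and the simulated series could be made longer than the data, or averaged over several runs, to cut down the simulation noise in \( \tilde{\beta}_t(\theta) \).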

The remarkable thing about this is that it works, in the sense of giving consistent parameter estimates, under not too strong conditions. Suppose that the data really are generated under some parameter value \( \theta_0 \); we'd like to see \( \hat{\theta} \rightarrow \theta_0 \). (Estimating the pseudo-truth in a mis-specified model works similarly but is more complicated than I feel like going into right now.) Sufficient conditions for this are that

- the auxiliary estimates converge \[ \tilde{\beta}_t(\theta) \rightarrow \beta(\theta) \] uniformly in \( \theta \), and
- the function \( \beta(\theta) \) is invertible.

Basically, these mean that the set of auxiliary parameters has to be rich enough to characterize or distinguish the different values of the generative parameters, and we need to be able to consistently estimate the former. This means we need at least as many auxiliary parameters as generative ones, so auxiliary models tend to be ones where it's easy to keep loading on parameters. (Adding too many auxiliary parameters does lead to loss of efficiency, however.) If \( \beta(\theta) \) is also differentiable in \( \theta \), and some additional regularity conditions hold, then we even get asymptotic Gaussian errors, with the matrix of partial derivatives \( \partial \beta_i/\partial \theta_j \) playing a role like the Fisher information matrix. I can't resist adding that the usual conditions quoted for the consistency of indirect inference are stronger, and that the ones given here come from a chapter in the dissertation of my student Linqiao Zhao.
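
To make the invertibility condition concrete, here is a small worked example, under assumptions which are not in the text above. Take the generative model to be a first-order moving average, \( X_t = \epsilon_t + \theta \epsilon_{t-1} \), with IID, mean-zero, finite-variance noise, and take the auxiliary model to be a first-order autoregression fit by least squares, so that \( \hat{\beta} \) is the slope of \( x_t \) on \( x_{t-1} \). The auxiliary estimate then converges on the lag-one autocorrelation, \[ \beta(\theta) = \frac{\theta}{1+\theta^2}, \qquad \frac{d\beta}{d\theta} = \frac{1-\theta^2}{\left(1+\theta^2\right)^2} . \] The derivative is positive for \( |\theta| < 1 \), so \( \beta(\theta) \) is invertible there, with \( \theta = \left(1 - \sqrt{1-4\beta^2}\right)/(2\beta) \) for \( \beta \neq 0 \), and \( \theta = 0 \) when \( \beta = 0 \). Over the whole real line, however, \( \theta \) and \( 1/\theta \) give the same \( \beta \), so this one auxiliary parameter cannot tell them apart; either the parameter space has to be restricted or the auxiliary model has to be enriched.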

I think this is a really, really powerful idea, and one which should be much more widely adopted by people working with simulation models. In particular, one of my Cunning Plans is to make it work for agent-based modeling, and especially for models of social network formation.

A topic of particular interest to me is how to use non-parametric estimators, of regression or density curves say, as the auxiliary models, since then there is never any problem of having too few auxiliary parameters (though they might still be insensitive to the generative parameters, if one is looking at the wrong curves). Nickl and Pötscher, below, have some initial results in this direction.

("Approximate Bayesian computation" is a very similar idea, but
where ~~the plain truth of the evidence is corrupted by
prejudice~~ a prior distribution is used to stabilize estimates, at some
cost in sensitivity. I need to learn more about it.)
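
For comparison, the simplest form of approximate Bayesian computation, rejection sampling on a summary statistic, can be sketched as follows. This is a hedged illustration only, and every specific in it is an assumption rather than anything from the text above: the moving-average generative model, the uniform prior on \( (-0.9, 0.9) \), the lag-one regression slope as the summary, the tolerance `eps`, and the hypothetical function names.

```python
import random

def simulate_ma1(theta, n, rng):
    """Simulate n steps of x_t = e_t + theta * e_{t-1} with standard Gaussian noise."""
    e = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [e[t + 1] + theta * e[t] for t in range(n)]

def lag1_slope(x):
    """Summary statistic: least-squares slope of x_t on x_{t-1}, no intercept."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def abc_rejection(data, n_draws=1000, eps=0.05, seed=0):
    """Draw theta from the prior, simulate, and keep the draws whose
    simulated summary lands within eps of the observed summary."""
    rng = random.Random(seed)
    s_obs = lag1_slope(data)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-0.9, 0.9)  # assumed prior: uniform on (-0.9, 0.9)
        s_sim = lag1_slope(simulate_ma1(theta, len(data), rng))
        if abs(s_sim - s_obs) < eps:
            accepted.append(theta)
    return accepted

if __name__ == "__main__":
    # Stand-in "data": a simulation at a known parameter value, theta_0 = 0.3.
    data = simulate_ma1(0.3, 1000, random.Random(1))
    sample = abc_rejection(data)
    print("accepted draws:", len(sample))
    print("approximate posterior mean:", sum(sample) / len(sample))
```

The accepted draws approximate a sample from the posterior given the summary statistic, rather than a single point estimate; the summary plays much the same role as the auxiliary parameter in indirect inference.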

- Recommended, big-picture:
- Mark C. Beaumont, "Approximate Bayesian Computation in Evolution and Ecology", Annual Review of Ecology, Evolution, and Systematics **41**(2010): 379--406 [PDF preprint kindly provided to me by Prof. Beaumont]
- Christian Gouriéroux and Alain Monfort, Simulation-Based Econometric Methods [Review: By Indirection Find Direction Out]
- Christian Gouriéroux, Alain Monfort and E. Renault, "Indirect Inference", Journal of Applied Econometrics **8**(1993): S85--S118 [JSTOR]
- A. A. Smith, "Indirect Inference", in The New Palgrave Dictionary of Economics 2nd edition [PDF preprint]

- Recommended, close-ups:
- Bruce E. Kendall, Stephen P. Ellner, Edward McCauley, Simon N. Wood, Cheryl J. Briggs, William W. Murdoch and Peter Turchin, "Population Cycles in the Pine Looper Moth: Dynamical Tests of Mechanistic Hypotheses", Ecological Monographs **75**(2005): 259--276 [PDF reprint. I learned about indirect inference by hearing Prof. Ellner talk about this paper at the 2007 Montreal workshop on statistics for dynamical systems.]
- Richard Nickl, Benedikt M. Pötscher, "Efficient Simulation-Based Minimum Distance Estimation and Indirect Inference", arxiv:0908.0433
- Simon N. Wood, "Statistical inference for noisy nonlinear ecological dynamic systems", Nature **466**(2010): 1102--1104

- Modesty forbids me to recommend:
- The lecture on indirect inference in my complexity and statistics course (currently number 19)

- Pride compels me to recommend:
- Linqiao Zhao, A Model of Limit-Order Book Dynamics and a Consistent Estimation Procedure, Ph.D. thesis, Statistics Department, Carnegie Mellon University, 2010 [PDF draft]

- To read:
- Chris P. Barnes, Sarah Filippi, Michael P.H. Stumpf, Thomas Thorne, "Considerate Approaches to Achieving Sufficiency for ABC model selection", Statistics and Computing **22**(2012): 1181--1197, arxiv:1106.6281
- Johanna Bertl, Gregory Ewing, Carolin Kosiol, Andreas Futschik, "Approximate Maximum Likelihood Estimation", arxiv:1507.04553
- Michael G. B. Blum, "Approximate Bayesian Computation: A Nonparametric Perspective", Journal of the American Statistical Association **105**(2010): 1178--1187
- M. G. B. Blum, M. A. Nunes, D. Prangle, and S. A. Sisson, "A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation", Statistical Science **28**(2013): 189--208
- Carles Bretó, Daihai He, Edward L. Ionides, Aaron A. King, "Time series analysis via mechanistic models", Annals of Applied Statistics **3**(2009): 319--348, arxiv:0802.0021
- Marianne Bruins, James A. Duffy, Michael P. Keane, Anthony A. Smith Jr, "Generalized Indirect Inference for Discrete Choice Models", arxiv:1507.06115
- Giovanni Luca Ciampaglia, "A framework for the calibration of social simulation models", Advances in Complex Systems accepted, arxiv:1305.3842 [At last, somebody's doing this!!!]
- D. R. Cox and Christiana Kartsonaki, "The fitting of complex parametric models", Biometrika **99**(2012): 741--747
- Veronika Czellar and Elvezio Ronchetti, "Accurate and robust tests for indirect inference", Biometrika **97**(2010): 621--630
- Pierre Del Moral, Arnaud Doucet and Ajay Jasra, "An Adaptive Sequential Monte Carlo Method for Approximate Bayesian Computation" [PDF preprint]
- Christopher C. Drovandi, Anthony N. Pettitt, Malcolm J. Faddy, "Approximate Bayesian computation using indirect inference", Journal of the Royal Statistical Society C **60**(2011): 317--337
- Christopher C. Drovandi, Anthony N. Pettitt, and Anthony Lee, "Bayesian Indirect Inference Using a Parametric Auxiliary Model", Statistical Science **30**(2015): 72--95
- Paul Fearnhead, Dennis Prangle, "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation", Journal of the Royal Statistical Society B **74**(2012): 419--474
- Jean-Jacques Forneron, Serena Ng, "The ABC of Simulation Estimation with Auxiliary Statistics", arxiv:1501.01265
- Florian Gach, Benedikt M. Pötscher, "Non-Parametric Maximum Likelihood Density Estimation and Simulation-Based Minimum Distance Estimators", arxiv:1012.3851
- Mark Girolami, Anne-Marie Lyne, Heiko Strathmann, Daniel Simpson, Yves Atchade, "Playing Russian Roulette with Intractable Likelihoods", arxiv:1306.4032
- Aude Grelaud, Christian Robert, Jean-Michel Marin, Francois Rodolphe, Jean-Francois Taly, "ABC likelihood-free methods for model choice in Gibbs random fields", arxiv:0807.2767
- Ajay Jasra, Nikolas Kantas, Elena Ehrlich, "Approximate Inference for Observation Driven Time Series Models with Intractable Likelihoods", arxiv:1303.7318
- J.-M. Marin, N. Pillai, C. P. Robert, J. Rousseau, "Relevant statistics for Bayesian model choice", arxiv:1110.4700
- Jean-Michel Marin, Pierre Pudlo, Christian P. Robert, Robin Ryder, "Approximate Bayesian Computational methods", arxiv:1101.0955
- Umberto Picchini, "Inference for SDE models via Approximate Bayesian Computation", arxiv:1204.5459
- Dennis Prangle, Paul Fearnhead, Murray P. Cox, Patrick J. Biggs, Nigel P. French, "Semi-automatic selection of summary statistics for ABC model choice", arxiv:1302.5624
- Oliver Ratmann, Anton Camacho, Adam Meijer, Gé Donker, "Statistical modelling of summary values leads to accurate Approximate Bayesian Computations", arxiv:1305.4283 [Sounds a bit like what Wood does in his Nature paper]
- Oliver Ratmann, Pierre Pudlo, Sylvia Richardson, Christian Robert, "Monte Carlo algorithms for model assessment via conflicting summaries", arxiv:1106.5919
- F. J. Rubio, Adam M. Johansen, "A Simple Approach to Maximum Intractable Likelihood Estimation", Electronic Journal of Statistics **7**(2013): 1632--1654, arxiv:1301.0463
- Guosheng Yin, Yanyuan Ma, Faming Liang, and Ying Yuan, "Stochastic Generalized Method of Moments", Journal of Computational and Graphical Statistics forthcoming (2011) [Fast stochastic optimization for GMM --- applicable to II as well?]

*Previous versions*: 2010-09-19 21:17