## Statistics

*12 Jan 2019 15:25*

An application of probability, with intimate ties to machine learning, non-demonstrative inference and induction.

Since June 2005, I have been a (very, very junior) professor of statistics. This made me interested in how to teach it.

*See also*: Properties vs. principles in defining "good statistics"

- Dependent data
- Statistical inference for stochastic processes, a.k.a. time-series analysis. Signal processing and filtering. Spatial statistics.
- Model selection
- Especially: adapting to unknown characteristics of the data, like unknown noise distributions, or unknown smoothness of the regression function.
- Model discrimination
- That is, designing experiments so as to discriminate between competing classes of model. Adaptation to data issues here.
- Rates of convergence of estimators to true values
- Empirical process theory. (Cf. some questions in ergodic theory).
- Estimating distribution functions
- And estimating entropies, or other functionals of distributions.
- Non-parametric methods
- Both those that are genuinely distribution-free, and those that would more accurately be mega-parametric (even infinitely-parametric) methods, such as neural networks
- Regression
- Bootstrapping and other resampling methods
- Cross-validation
- Sufficient statistics
- Exponential families
- Information Geometry
- Partial identification of parametric statistical models
- Causal Inference
- Decision theory
- Conventional, and the sorts with some connection to how real decisions are made.
- Graphical models
- Monte Carlo and other simulation methods
- "De-Bayesing"
- Ways of taking Bayesian procedures and eliminating dependence on priors, either by replacing them by initial point-estimates, or by showing the prior doesn't matter, asymptotically or hopefully sooner. See: Frequentist consistency of Bayesian procedures.
- Computational Statistics
- Statistics of structured data
- Statistics on manifolds
- i.e., what to do when the data live in a continuous but non-Euclidean space.
- Grammatical Inference
- Factor analysis
- Mixture models
- Multiple testing
- Predictive distributions
- ... especially if they have confidence/coverage properties
- Density estimation
- especially *conditional* density estimation; and density estimation on graphical models
- Indirect inference
- "Missing mass" and species abundance problems
- I.e., how much of the distribution have we not yet seen?
- Independence Tests, Conditional Independence Tests, Measures of Dependence and Conditional Dependence
- Two-Sample Tests
- Statistical Emulators for Simulation Models
- Hilbert Space Methods for Statistics and Probability
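
The "missing mass" item in the list above asks how much of the distribution has not yet been seen. As a minimal sketch of one classical answer, the Good-Turing estimate takes the fraction of observations that belong to categories seen exactly once. The function name `good_turing_missing_mass` and the toy sample are hypothetical, chosen only for illustration; nothing here comes from the references below.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Good-Turing estimate of the unseen probability mass.

    Returns N1 / n, where N1 is the number of distinct values that occur
    exactly once in the sample and n is the sample size.
    (Hypothetical helper name, for illustration only.)
    """
    n = len(sample)
    if n == 0:
        raise ValueError("need at least one observation")
    counts = Counter(sample)
    # Count the categories observed exactly once ("singletons").
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


# Toy data: "c", "d" and "e" each appear once among 8 observations.
observations = ["a", "b", "a", "c", "a", "d", "b", "e"]
print(good_turing_missing_mass(observations))  # prints 0.375
```

The estimate is zero whenever every observed category has been seen at least twice, which is one reason it is usually treated as a starting point rather than a final answer.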

*Things I need to learn more about*:

- Recommended, non-technical:
- Jordan Ellenberg, How Not to Be Wrong: The Power of Mathematical Thinking
- Francis Galton, "Statistical Inquiries into the Efficacy of Prayer," Fortnightly Review **12** (1872): 125--135 [online]
- Larry Gonick and Woollcott Smith, The Cartoon Guide to Statistics
- Ian Hacking, The Taming of Chance
- D. Huff, How to Lie with Statistics
- Theodore Porter, The Rise of Statistical Thinking, 1820--1900
- Constance Reid, Neyman from Life [Biography of Jerzy Neyman, one of *the* makers of modern statistical theory, and, I am happy to say, among the brighter lights of my alma mater. Reid does an excellent job of *explaining* Neyman's work in terms accessible to the general reader. There is a new edition, titled simply Neyman, but otherwise unchanged. Review by Steve Laniel]
- Edward R. Tufte
- The Visual Display of Quantitative Information
- Visual Explanations

- Recommended, technical, big pictures:
- Ole E. Barndorff-Nielsen and David R. Cox, Inference and Asymptotics
- M. S. Bartlett
- Richard A. Berk, Regression Analysis: A Constructive Critique
- Leo Breiman, "Statistical Modeling: The Two Cultures", Statistical Science **16** (2001): 199--231 [very much including the discussion by others and the reply by Breiman. Thanks to Chris Wiggins for alerting me to this.]
- David R. Brillinger, "The 2005 Neyman Lecture: Dynamic Indeterminism in Science", Statistical Science **23** (2008): 48--64, arxiv:0808.0620 [With discussions and response]
- D. R. Cox and Christl A. Donnelly, Principles of Applied Statistics [Review: Turning Scientific Perplexity into Ordinary Statistical Uncertainty]
- Harald Cramér, Mathematical Methods of Statistics
- C. David Garson, Statnotes: An Online Textbook
- Peter Guttorp, Stochastic Modeling of Scientific Data [Good introduction to using dependent data]
- Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction [Website, with full text free in PDF]
- Robert E. Kass, "Statistical Inference: The Big Picture", Statistical Science **26** (2011): 1--19, arxiv:1106.2895
- Tony Lin [Prof. Dr. Lin was working on his doctorate when I was an undergrad at Berkeley; we became friends at the I-House, if that is the word I want for someone who offered to keep my brain alive in a jigger-glass and subject it to random electrical shocks ("Jzzt! Jzzt!"). But despite his questionable tastes in acquaintances, he's a damn good statistician and a model teacher.]
- Virtual Statistics 50 [Intro. statistics]
- Virtual Statistics 154A [Intro. statistics with algebra and calculus]

- Deborah Mayo, Error and the Growth of Experimental Knowledge [Review: We Have Ways of Making You Talk, or, Long Live Peircism-Popperism-Neyman-Pearson Thought!]
- NIST, Electronic Handbook of Statistical Methods
- E. J. G. Pitman, Some Basic Theory for Statistical Inference
- Jorma Rissanen, Stochastic Complexity in Statistical Inquiry
- Mark Schervish, Theory of Statistics
- Galit Shmueli, "To Explain or to Predict?", Statistical Science **25** (2010): 289--310, arxiv:1101.0891
- Aad van der Vaart, Asymptotic Statistics
- Larry Wasserman
- All of Statistics
- All of Nonparametric Statistics

- Recommended, technical, close-ups:
- A. C. Atkinson and A. N. Donev, Optimum Experimental Design
- F. Bacchus, H. E. Kyburg and M. Thalos, "Against Conditionalization," Synthese **85** (1990): 475--506 [Why "Dutch book" arguments do not, in fact, mean that rational agents must be Bayesian reasoners]
- Andrew Barron and Nicolas Hengartner, "Information theory and superefficiency", Annals of Statistics **26** (1998): 1800--1825
- M. S. Bartlett, "The Statistical Significance of Odd Bits of Information", Biometrika **39** (1952): 228--237 [A goodness-of-fit test based on fluctuations of the entropy. JSTOR]
- M. J. Bayarri and James O. Berger, "*P* Values for Composite Null Models", Journal of the American Statistical Association **95** (2000): 1127--1142 [To be read in conjunction with Robins, van der Vaart and Ventura, below. JSTOR]
- Anil K. Bera and Aurobindo Ghosh, "Neyman's Smooth Test and Its Applications in Econometrics", pp. 177--230 in Aman Ullah, Alan T. K. Wan and Anoop Chaturvedi (eds.), Handbook of Applied Econometrics and Statistical Inference, SSRN/272888
- Julian Besag, "A Candidate's Formula: A Curious Result in Bayesian Prediction", Biometrika **76** (1989): 183 [A wonderful and bizarre expression for the Bayesian predictive density, in terms of how adding a new data point would change the posterior. JSTOR]
- P. J. Bickel, "On Adaptive Estimation", Annals of Statistics **10** (1982): 647--671
- Pier Bissiri, Chris Holmes, Stephen Walker, "A General Framework for Updating Belief Distributions", arxiv:1306.6430
- David Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions
- Leo Breiman, "No Bayesians in Foxholes", IEEE Expert: Intelligent Systems and Their Applications **12** (1997): 21--24 [PDF reprint; comments by Andy Gelman]
- Peter Bühlmann and Sara van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications
- Ronald W. Butler, "Predictive Likelihood Inference with Applications", Journal of the Royal Statistical Society B **48** (1986): 1--38 ["in the predictive setting, all parameters are nuisance parameters". JSTOR]
- Venkat Chandrasekaran and Michael I. Jordan, "Computational and Statistical Tradeoffs via Convex Relaxation", Proceedings of the National Academy of Sciences (USA) **110** (2013): E1181--E1190, arxiv:1211.1073
- Hwan-sik Choi and Nicholas M. Kiefer, "Differential Geometry and Bias Correction in Nonnested Hypothesis Testing" [PDF preprint via Kiefer]
- A. C. Davison and D. V. Hinkley, Bootstrap Methods and their Applications
- Aurore Delaigle and Peter Hall, "Higher Criticism in the Context of Unknown Distribution, Non-Independence and Classification", pp. 109--138 of Sastry, Rao, Delampady and Rajeev (eds.), Platinum Jubilee Proceedings of the Indian Statistical Institute [PDF reprint via Prof. Delaigle]
- J. Bradford DeLong and Kevin Lang, "Are All Economic Hypotheses False?", Journal of Political Economy **100** (1992): 1257--1272 [PDF preprint. The point is about abuses of hypothesis testing, not economic hypotheses as such.]
- Amir Dembo and Yuval Peres, "A Topological Criterion for Hypothesis Testing", Annals of Statistics **22** (1994): 106--117 ["A simple topological criterion is given for the existence of a sequence of tests for composite hypothesis testing problems, such that almost surely only finitely many errors are made."]
- David Donoho and Jiashun Jin, "Higher criticism for detecting sparse heterogeneous mixtures", Annals of Statistics **32** (2004): 962--994, arxiv:math.ST/0410072
- Earman, Bayes or Bust? A Critical Account of Bayesian Confirmation Theory
- Bradley Efron
- "Bootstrap Methods: Another Look at the Jackknife", Annals of Statistics **7** (1979): 1--26 [The original paper; staggeringly understandable]
- The Jackknife, the Bootstrap, and Other Resampling Plans
- "Maximum Likelihood and Decision Theory", The Annals of Statistics **10** (1982): 340--356

- Mikhail Ermakov, "On Consistent Hypothesis Testing", arxiv:1403.6296
- Michael Evans, "What does the proof of Birnbaum's theorem prove?", arxiv:1302.5468
- S. N. Evans and P. B. Stark, "Inverse Problems as Statistics" [Abstract, PDF]
- Steve Fienberg, The Analysis of Cross-Classified Categorical Data
- Don Fraser, "Is Bayes posterior just quick and dirty confidence", Statistical Science **26** (2011): 299--316, arxiv:1112.5582 [See also the discussions by others, and Fraser's reply. My answer to the question posed in Fraser's title is "yes", or rather "YES!"]
- Andrew Gelman, Jennifer Hill and Masanao Yajima, "Why we (usually) don't have to worry about multiple comparisons" [PDF preprint]
- Andrew Gelman and Iain Pardoe, "Average predictive comparisons for models with nonlinearity, interactions, and variance components", Sociological Methodology **37** (2007): 23--51 [PDF preprint, Gelman's comments]
- Christopher Genovese, Peter Freeman, Larry Wasserman, Robert C. Nichol and Christopher Miller, "Inference for the Dark Energy Equation of State Using Type IA Supernova Data", Annals of Applied Statistics **3** (2009): 144--178, arxiv:0805.4136 [I am biased, because Genovese and Wasserman are friends, but this seems to me a model of a modern applied statistics paper: use interesting statistical ideas to say something helpful about an important scientific problem *on its own terms*, rather than distorting the problem until it "looks like a nail".]
- Christopher Genovese and Larry Wasserman, "Adaptive Confidence Bands", Annals of Statistics **36** (2008): 875--905, math.ST/0701513
- Charles J. Geyer, "Le Cam Made Simple: Asymptotics of Maximum Likelihood without the LLN or CLT or Sample Size Going to Infinity", arxiv:1206.4762 [There are two separable points here. One is that much of the usual asymptotic theory of maximum likelihood follows from the quadratic form of the likelihood *alone*; whenever and however that is reached, those consequences follow. Approximately quadratic likelihoods imply approximations to the usual asymptotics. This is unquestionably correct. The other is some bashing of results like the law of large numbers and central limit theorem, which seems misguided to me.]
- Tilmann Gneiting, "Making and Evaluating Point Forecasts", Journal of the American Statistical Association **106** (2011): 746--762, arxiv:0912.0902
- Tilmann Gneiting, Fadoua Balabdaoui and Adrian E. Raftery, "Probabilistic Forecasts, Calibration and Sharpness", Journal of the Royal Statistical Society B **69** (2007): 243--268
- Trygve Haavelmo, "The Probability Approach in Econometrics", Econometrica **12** (1944, supplement): iii--115 [JSTOR]
- Mark S. Handcock and Martina Morris, Relative Distribution Methods in the Social Sciences
- Bruce E. Hansen
- "The Likelihood Ratio Test Under Nonstandard Conditions: Testing the Markov Switching Model of GNP", Journal of Applied Econometrics **7** (1992): S61--S82 [I very much like the approach of treating the likelihood ratio as an empirical process; why haven't I seen it before? (Also, the state-of-the-art in simulating Gaussian processes must be much better now than what Hansen had in '92, which would make this even more practical.) PDF reprint.]
- "Inference when a nuisance parameter is not identified under the null hypothesis", Econometrica **64** (1996): 413--430

- Jeffrey D. Hart, Nonparametric Smoothing and Lack-of-Fit Tests
- Kieran Healy, Data Visualization: A Practical Introduction
- Nils Lid Hjort and David Pollard, "Asymptotics for minimisers of convex processes", arxiv:1107.3806 [Very elegant]
- Peter J. Huber
- "On the Non-Optimality of Optimal Procedures"
- "The behavior of maximum likelihood estimates under nonstandard conditions", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1 (Univ. of Calif. Press, 1967), pp. 221--233

- Wilbert C. M. Kallenberg and Teresa Ledwina, "Data-driven smooth tests when the hypothesis is composite", Journal of the American Statistical Association **92** (1997): 1094--1104 [Abstract, PDF reprint; JSTOR]
- Gary King, A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data
- Gary King and Margaret Roberts, "How Robust Standard Errors Expose Methodological Problems They Do Not Fix" [PDF preprint]
- Solomon W. Kullback, Information Theory and Statistics
- Michael Lavine and Mark J. Schervish, "Bayes Factors: What They Are and What They Are Not" [PS preprint]
- Steffen Lauritzen, Extremal Families and Systems of Sufficient Statistics [See comments under sufficient statistics]
- J. F. Lawless and Marc Fredette, "Frequentist prediction intervals and predictive distributions", Biometrika **92** (2005): 529--542 ["Frequentist predictive distributions are defined as confidence distributions .... A simple pivotal-based approach that produces prediction intervals and predictive distributions with well-calibrated frequentist probability interpretations is introduced, and efficient simulation methods for producing predictive distributions are considered. Properties related to an average Kullback-Leibler measure of goodness for predictive or estimated distributions are given."]
- Lucien Le Cam
- Erich L. Lehmann, "On likelihood ratio tests", math.ST/0610835
- Jing Lei, Alessandro Rinaldo, and Larry Wasserman, "A Conformal Prediction Approach to Explore Functional Data", arxiv:1302.6452
- Jing Lei, James Robins, and Larry Wasserman, "Efficient Nonparametric Conformal Prediction Regions", arxiv:1111.1418
- Jing Lei and Larry Wasserman, "Distribution Free Prediction Bands", arxiv:1203.5422
- Bing Li, "A minimax approach to consistency and efficiency for estimating equations," Annals of Statistics **24** (1996): 1283--1297
- Bruce Lindsay and Jiawei Liu, "Model Assessment Tools for a Model False World", Statistical Science **24** (2009): 303--318, arxiv:1010.0304 [Their model-adequacy index is, essentially, the number of samples needed to detect the falsity of the model with some reasonable, pre-set level of power, with fixed size/significance level. This is a very natural quantity. In fact, by results which go back to Kullback's book, the power grows exponentially, with a rate equal to the Kullback-Leibler divergence rate. (More exactly, one minus the power goes to zero exponentially at that rate, but you know what I meant.) Large deviations theory includes generalizations of this result. Many statisticians, I'd guess, would prefer the Lindsay-Liu index because they will feel it more natural to gauge error in terms of a sample size rather than bits, but to each their own.]
- Brad Luen and Philip B. Stark, "Testing earthquake predictions", pp. 302--315 in Deborah Nolan and Terry Speed (eds.), Probability and Statistics: Essays in Honor of David A. Freedman [The issues arise, however, not just for earthquakes, but for all sorts of clustered events]
- Charles Manski, Identification for Prediction and Decision
- Deborah G. Mayo and D. R. Cox, "Frequentist statistics as a theory of inductive inference", math.ST/0610846
- Neri Merhav, "Bounds on Achievable Convergence Rates of Parameter Estimators via Universal Coding", IEEE Transactions on Information Theory **40** (1994): 1210--1215 [PDF reprint via Prof. Merhav. Thanks to Max Raginsky for pointing this out.]
- Karthika Mohan, Judea Pearl and Jin Tian, "Graphical Models for Inference with Missing Data", NIPS 2013 [There was at least one preprint version with the more pointed title "Missing Data as a Causal Inference Problem"]
- M. B. Nevel'son and R. Z. Has'minskii, Stochastic Approximation and Recursive Estimation
- Jerzy Neyman, "On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection", Journal of the Royal Statistical Society **97** (1934): 558--625 [This is an astonishing paper on multiple levels. One is the thoroughness with which it achieves its main objective, of demonstrating the superiority of random sampling over alternatives. Another is that it seems to be the first conscious use of confidence intervals. Yet another is the way it set the pattern for a huge fraction of all subsequent statistics down to the present.]
- Andrey Novikov, "Optimal sequential multiple hypothesis tests", arxiv:0811.1297
- David Pollard
- "Asymptotics via Empirical Processes", Statistical Science **4** (1989): 341--354
- Empirical Processes: Theory and Applications

- Jeffrey S. Racine, "Nonparametric Econometrics: A Primer", Foundations and Trends in Econometrics **3** (2008): 1--88 [Good primer of nonparametric techniques for regression, density estimation and hypothesis testing; next to no economic content (except for examples). Presumes reasonable familiarity with parametric statistics. PDF reprint]
- J. N. K. Rao, "Some recent advances in model-based small area estimation", Survey Methodology **25** (1999): 175--186
- James M. Robins and Ya'acov Ritov, "Toward a Curse of Dimensionality Appropriate (CODA) Asymptotic Theory for Semi-Parametric Models", Statistics in Medicine **16** (1997): 285--319 [PDF reprint via Prof. Robins]
- James M. Robins, Aad van der Vaart and Valérie Ventura, "Asymptotic Distribution of *P* Values in Composite Null Models", Journal of the American Statistical Association **95** (2000): 1143--1156 [JSTOR. Paired article with Bayarri and Berger, above. The discussions and rejoinders (pp. 1157--1172) are valuable.]
- George G. Roussas, Contiguity of Probability Measures: Some Applications in Statistics
- C. Scott and R. Nowak, "A Neyman-Pearson Approach to Statistical Learning", IEEE Transactions on Information Theory **51** (2005): 3806--3819 [Comments: Learning Your Way to Maximum Power]
- Steven G. Self and Kung-Yee Liang, "Asymptotic Properties of Maximum Likelihood Estimators and Likelihood Ratio Tests Under Nonstandard Conditions", Journal of the American Statistical Association **82** (1987): 605--610 [JSTOR]
- Tom Shively, Stephen Walker, "On the Equivalence between Bayesian and Classical Hypothesis Testing", arxiv:1312.0302
- Jeffrey S. Simonoff, Smoothing Methods in Statistics
- Spyros Skouras, "Decisionmetrics: Towards a Decision-Based Approach to Econometrics," SFI Working Paper 2001-11-064 [Applies far outside econometrics. If what you really want to do is to minimize a *known* loss function, optimizing a conventional accuracy measure, e.g. least squares, can be highly counterproductive.]
- Aris Spanos
- "The Curve-Fitting Problem, Akaike-type Model Selection, and the Error Statistical Approach" [Or: could *your* model selection tell you that Kepler is better than Ptolemy? Technical report, economics dept., Virginia Tech, 2006. PDF]
- "Where do statistical models come from? Revisiting the problem of specification", math.ST/0610849

- Yun Ju Sung, Charles J. Geyer, "Monte Carlo likelihood inference for missing data models", Annals of Statistics **35** (2007): 990--1011, arxiv:0708.2184
- Alexandre B. Tsybakov, Introduction to Nonparametric Estimation
- Sara van de Geer, Empirical Process Theory in M-Estimation [Finding non-asymptotic rates of convergence for common estimators]
- Aad van der Vaart, "The Statistical Work of Lucien Le Cam", Annals of Statistics **30** (2002): 631--682
- Quang H. Vuong, "Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses", Econometrica **57** (1989): 307--333
- Grace Wahba, Spline Models for Observational Data
- Abraham Wald, "Estimation of a Parameter When the Number of Unknown Parameters Increases Indefinitely with the Number of Observations", Annals of Mathematical Statistics **19** (1948): 220--227
- Michael E. Wall, Andreas Rechtsteiner and Luis M. Rocha, "Singular Value Decomposition and Principal Component Analysis," physics/0208101
- Michael D. Ward, Brian D. Greenhill and Kristin M. Bakke, "The perils of policy by p-value: Predicting civil conflicts", Journal of Peace Research **47** (2010): 363--375
- Larry Wasserman, "Low Assumptions, High Dimensions", RMM **2** (2011): 201--209
- Halbert White, Estimation, Inference and Specification Analysis
- Achilleas Zapranis and Apostolos-Paul Refenes, Principles of Neural Model Identification, Selection and Adequacy, with Applications to Financial Econometrics
- Sven Zenker, Jonathan Rubin, Gilles Clermont, "From Inverse Problems in Mathematical Physiology to Quantitative Differential Diagnoses", PLoS Computational Biology **3** (2007): e205
- Johanna F. Ziegel and Tilmann Gneiting, "Copula Calibration", arxiv:1307.7650

- Recommended, historical:
- Nicola Giocoli, "From Wald to Savage: homo economicus becomes a Bayesian statistician" [preprint]
- Erich L. Lehmann, "On the history and use of some standard statistical models", pp. 114--126 in Deborah Nolan and Terry Speed (eds.), Probability and Statistics: Essays in Honor of David A. Freedman
- Henry Scheffe, "Statistical Inference in the Non-Parametric Case", Annals of Mathematical Statistics **14** (1943): 305--332 [Recommended not as a historical study, but a historical document]
- Stephen M. Stigler, "The Epic Story of Maximum Likelihood", Statistical Science **22** (2007): 598--620, arxiv:0804.2996

- Modesty forbids me to recommend:
- Andrew Gelman and CRS, "Philosophy and the practice of Bayesian statistics", submitted to the Journal of the American Statistical Association, arxiv:1006.3868
- CRS

- Not unambiguously recommended:
- Peter J. Diggle and Amanda G. Chetwynd, Statistics and Scientific Method: An Introduction for Students and Researchers [A missed opportunity.]
- Peter McCullagh, "What is a statistical model?", Annals of Statistics **30** (2002): 1225--1310 [I'm not sure what to think about this; some of the ideas about requiring invariance (or equivariance) under transformations make sense, but I don't know that they lead to anything positive, or need such arcane category-theoretic expression. We should however have cited this in our paper on projectibility and consistency under sampling. (I blame our referees for not making the connection.) The discussion and rejoinder are worth reading. Kalman's contribution is very special.]

- To read, textbooks, reviews, etc.:
- R. Harald Baayen, Analyzing Linguistic Data: A Practical Introduction to Statistics Using R
- Vic Barnett, Comparative Statistical Inference
- Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation
- A. C. Davison, Statistical Models
- George Estabrook, Computational Approach to Statistical Arguments in Ecology and Evolution
- William W. Hsieh, Machine Learning Methods in the Environmental Sciences: Neural Networks and Kernels
- Alan Julian Izenman, Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning
- Erich L. Lehmann, Elements of Large-Sample Theory
- National Institute of Standards and Technology, Engineering Statistics Handbook [All the sections I've looked at have been quite good.]
- Yudi Pawitan, In All Likelihood: Statistical Modeling and Inference Using Likelihood
- Aris Spanos, Probability Theory and Statistical Inference: Econometric Modeling with Observational Data

- To read, history and philosophy:
- Odd O. Aalen, Per Kragh Andersen, Ørnulf Borgan, Richard D. Gill, Niels Keiding, "History of applications of martingales in survival analysis", Electronic Journal for History of Probability and Statistics **5** (2009), arxiv:1003.0188
- Carolina Armenteros, "From Human Nature to Normal Humanity: Joseph de Maistre, Rousseau, and the Origins of Moral Statistics", Journal of the History of Ideas **68** (2007): 107--130 [Abstract, text links]
- Dan Bouk, How Our Days Became Numbered: Risk and the Rise of the Statistical Individual
- Phaedra Daipha, Masters of Uncertainty: Weather Forecasters and the Quest for Ground Truth
- Ian Hacking, "The Theory of Probable Inference: Neyman, Peirce and Braithwaite," in Science, Belief and Behavior: Essays in Honor of R. B. Braithwaite ed. D. H. Mellor
- Anders Hald, A History of Parametric Statistical Inference from Bernoulli to Fisher, 1713--1935
- Orit Halpern, Beautiful Data: A History of Vision and Reason Since 1945
- Kendall and Plackett (eds.), Studies in the History of Statistics and Probability
- Kyburg, Uncertain Inference
- Erich Lehmann and Juliet Shaffer, Fisher, Neyman, and the Creation of Classical Statistics
- Mayo and Hollander (eds.), Acceptable Evidence: Science and Values in Risk Management
- Leland Gerson Neuberg, Conceptual Anomalies in Economics and Statistics: Lessons from the Social Experiment
- Theodore Porter, Trust in Numbers
- Nico Randeraad, States and Statistics in the Nineteenth Century: Europe by Numbers
- Thomas A. Stapleford, The Cost of Living in America: A Political History of Economic Statistics, 1880--2000
- Stephen M. Stigler
- The History of Statistics: The Measure of Uncertainty before 1900
- Statistics on the Table: The History of Statistical Concepts and Methods

- Gregory D. Wilson, Articulation Theory and Disciplinary Change: Unpacking the Bayesian-Frequentist Paradigm Conflict in Statistical Science [Ph.D. thesis, New Mexico State University, 2001]
- S. L. Zabell, Symmetry and Its Discontents: Essays on the History of Inductive Probability

- To read, research literature:
- Felix Abramovich, Yoav Benjamini, David L. Donoho and Iain M. Johnstone, "Adapting to Unknown Sparsity by controlling the False Discovery Rate", math.ST/0505374 [I don't really care about sparsity, but they promise novel relations between the FDR control and asymptotic minimaxity and complexity-penalized model selection.]
- Gianfranco Adimari and Annamaria Guolo, "A note on the asymptotic behaviour of empirical likelihood statistics", Statistical Methods and Applications **19** (2010): 463--476
- Stéphanie Allassonniere, Estelle Kuhn, "Convergent Stochastic Expectation Maximization algorithm with efficient sampling in high dimension. Application to deformable template model estimation", arxiv:1207.5938
- Elizabeth S. Allman, Catherine Matias, John A. Rhodes, "Identifiability of parameters in latent structure models with many observed variables", Annals of Statistics **37** (2009): 3099--3132, arxiv:0809.5032
- Nabil I. Al-Najjar, Alvaro Sandroni, Rann Smorodinsky and Jonathan Weinstein, "Testing Theories with Learnable and Predictive Representations", Journal of Economic Theory **145** (2010): 2203--2217
- Miguel A. Arcones, "Bahadur Efficiency of the Likelihood Ratio Test" [PDF preprint from 2005, presumably since published...]
- Barry C. Arnold et al., Conditional Specification of Statistical Models
- R. A. Bailey, Design of Comparative Experiments
- Sivaraman Balakrishnan, Alessandro Rinaldo, Don Sheehy, Aarti Singh, Larry Wasserman, "Minimax Rates for Homology Inference", arxiv:1112.5627
- Roger Barlow, "Asymmetric Errors", physics/0401042
- Ole E. Barndorff-Nielsen and David R. Cox, "Prediction and Asymptotics", Bernoulli **2** (1996): 319--340
- Ole E. Barndorff-Nielsen, David R. Cox and Claudia Klüppelberg (eds.), Complex Stochastic Systems
- M. J. Bayarri and M. E. Castellanos, "Bayesian Checking of the Second Levels of Hierarchical Models", Statistical Science **22** (2007): 322--343
- Zvika Ben-Haim and Yonina C. Eldar, "The Cramer-Rao Bound for Sparse Estimation", arxiv:0905.4378
- Yoav Benjamini, Marina Bogomolov, "Adjusting for selection bias in testing multiple families of hypotheses", arxiv:1106.3670
- Yoav Benjamini, Vered Madar and Philip B. Stark, "Simultaneous confidence intervals uniformly more likely to determine signs", Biometrika **100** (2013): 283--300
- David R. Bickel
- "The Strength of Statistical Evidence for Composite Hypotheses: Inference to the Best Explanation" [Preprint]
- "Resolving conflicts between statistical methods by probability combination: Application to empirical Bayes analyses of genomic data", arxiv:1111.6174
- "A prior-free framework of coherent inference and its derivation of simple shrinkage estimators" [preprint]

- Peter J. Bickel, C. A. J. Klaassen, Y. Ritov and J. A. Wellner, Efficient and Adaptive Estimation for Semiparametric Models
- Peter J. Bickel and Bo Li, "Regularization in Statistics", Test **15** (2006): 271--344 [PDF reprint]
- Peter J. Bickel and Y. Ritov, "Non-Parametric Estimators Which Can Be `Plugged-In'", UCB Stat. Tech. Rep. 602 [abstract, pdf]
- Lucien Birgé
- "A New Lower Bound for Multiple Hypothesis Testing", IEEE Transactions on Information Theory **51** (2005): 1611--1615
- "About the non-asymptotic behaviour of Bayes estimators", arxiv:1402.3695

- Gilles Blanchard, Sylvain Delattre, Etienne Roquain, "Testing over a continuum of null hypotheses", arxiv:1110.3599
- Michael Blum, "Approximate Bayesian Computation: a non-parametric perspective", arxiv:0904.0635
- Ingwer Borg and Patrick J. F. Groenen, Modern Multidimensional Scaling: Theory and Application
- A. R. Brazzale and A. C. Davison, "Accurate Parametric Inference for Small Samples", Statistical Science **23** (2008): 465--484 [Apparently, a preview for the book.]
- A. R. Brazzale, A. C. Davison and N. Reid, Applied Asymptotics: Case Studies in Small-Sample Statistics
- Trevor S. Breusch, "Hypothesis Testing in Unidentified Models", Review of Economic Studies **53** (1986): 635--651 [JSTOR]
- Adam D. Bull, "Honest adaptive confidence bands and self-similar functions", Electronic Journal of Statistics **6** (2012): 1490--1516, arxiv:1110.4985
- Adam D. Bull, Richard Nickl, "Adaptive confidence sets in $L^2$", Probability Theory and Related Fields **156** (2013): 889--919
- Florentina Bunea, Alexandre B. Tsybakov, Marten H. Wegkamp, Adrian Barbu, "Spades and Mixture Models", Annals of Statistics **38** (2010): 2525--2558, arxiv:0901.2044
- Dizza Bursztyn and David M. Steinberg, "Comparison of designs for computer experiments", Journal of Statistical Planning and Inference **136** (2006): 1103--1119
- T. Tony Cai, "Minimax and Adaptive Inference in Nonparametric Function Estimation", Statistical Science **27** (2012): 31--50
- T. Tony Cai and Mark G. Low, "An adaptation theory for nonparametric confidence intervals", Annals of Statistics **32** (2004): 1805--1840, math.ST/0503662
- Emmanuel Candes and Terence Tao, "Near Optimal Signal Recovery from Random Projections and Universal Encoding Strategies", math.CA/0410542
- Hervé Cardot, David Degras, Etienne Josserand, "Confidence bands for Horvitz-Thompson estimators using sampled noisy functional data", Bernoulli **19**(2013): 2067--2097, arxiv:1105.2135
- Hervé Cardot, Andre Mas and Pascal Sarda, "CLT in Functional Linear Regression Models", math.ST/0508073
- Kamalika Chaudhuri and Daniel Hsu, "Convergence Rates for Differentially Private Statistical Estimation", arxiv:1206.6395
- Djalil Chafai and Didier Concordet, "On the strong consistency of approximated M-estimators", math.ST/0507102
- In Hong Chang and Rahul Mukerjee, "Asymptotic results on the frequentist mean squared error of generalized Bayes point predictors", Statistics and Probability Letters **67**(2004): 65--71 [Note to self: file this one under "de-Bayesing".]
- Sandra Chapman, George Rowlands and Nicholas Watkins
- "Extremum statistics: A framework for data analysis," cond-mat/0106015
- "Extremum Statistics and Signatures of Long Range Correlations," cond-mat/0106015
- "The relationship between extremum statistics and universal fluctuations," cond-mat/0007275

- Xiaohong Chen, Markus Reiss, "On rate optimality for ill-posed inverse problems in econometrics", arxiv:0709.2003 [Non-parametric instrumental variables?]
- Xinjia Chen, "Sequential Tests of Statistical Hypotheses with Confidence Limits", arxiv:1007.4278
- N. N. Chentsov, Statistical Decision Rules and Optimal Inference
- Victor Chernozhukov, Denis Chetverikov, and Kengo Kato, "Anti-concentration and honest, adaptive confidence bands", Annals of Statistics **42**(2014): 1787--1818, arxiv:1303.7152
- Zhiyi Chi, "Effects of statistical dependence on multiple testing under a hidden Markov model", Annals of Statistics **39**(2011): 439--473
- Christine Choirat and Raffaello Seri, "Estimation in Discrete Parameter Models", Statistical Science **27**(2012): 278--293
- Bertrand Clarke, "Desiderata for a Predictive Theory of Statistics", Bayesian Analysis **5**(2010): 1--36
- Sandy Clarke, Peter Hall, "Robustness of multiple testing procedures against dependence", Annals of Statistics **37**(2009): 332--358, arxiv:0903.0464
- Arthur Cohen and Harold B. Sackrowitz, "Decision theory results for one-sided multiple comparison procedures", Annals of Statistics **33**(2005): 126--144, math.ST/0504505
- Arthur Cohen, Harold B. Sackrowitz, Minya Xu, "A new multiple testing method in the dependent case", Annals of Statistics **37**(2009): 1518--1544, arxiv:0906.3082
- John Copas and Shinto Eguchi, "Likelihood for statistically equivalent models", Journal of the Royal Statistical Society B **72**(2010): 193--217
- Daniel Commenges, "Statistical models: Conventional, penalized and hierarchical likelihood", Statistics Surveys **3**(2009): 1--17, arxiv:0808.4042
- Daniel Commenges, Helene Jacqmin-Gadda, Cecile Proust, and Jeremie Guedj, "A Newton-Like Algorithm for Likelihood Maximization: The Robust-Variance Scoring Algorithm", math.ST/0610402
- Cox and Wermuth, Multivariate Dependencies: Models, Analysis and Interpretation
- Anirban DasGupta, Asymptotic Theory of Statistics and Probability
- Alexandre d'Aspremont, Onureena Banerjee, Laurent El Ghaoui, "First-order methods for sparse covariance selection", math.OC/0609812
- I. Dattner, A. Goldenshluger, A. Juditsky, "On deconvolution of distribution functions", arxiv:1006.3918 ["nonparametric estimation of a continuous distribution function from observations with measurement errors... rate optimal estimators based on direct inversion of empirical characteristic function"]
- P. L. Davies
- "Data Features", Statistica Neerlandica **49**(1995): 185--245
- "Approximating Data", Journal of the Korean Statistical Society **37**(2008): 191--211 [With discussion and rejoinder. Open access?]

- P. L. Davies, A. Kovac and M. Meise, "Nonparametric Regression, Confidence Regions and Regularization", arxiv:0711.0690
- A. Philip Dawid, Steven de Rooij, Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, Vladimir Vovk, "Martingales and p-values as measures of evidence", arxiv:0912.4269
- Pierpaolo De Blasi and Stephen G. Walker, "Bayesian Estimation of the Discrepancy with Misspecified Parametric Models", Bayesian Analysis **8**(2013): 781--800
- Aurore Delaigle, Peter Hall and Jiashun Jin, "Robustness and accuracy of methods for high dimensional data analysis based on Student's t-statistic", Journal of the Royal Statistical Society B **forthcoming**(2011)
- Joshua V Dillon, Guy Lebanon, "Stochastic Composite Likelihood", Journal of Machine Learning Research **11**(2010): 2597--2633, apparently the final version of arxiv:1003.0691
- David L. Donoho, "Estimation by epsilon-nets" (Le Cam Lecture, 2003; find citation)
- David L. Donoho and Richard C. Liu, "The 'Automatic' Robustness of Minimum Distance Functionals", Annals of Statistics **16**(1988): 552--586
- David L. Donoho and Jared Tanner, "Observed Universality of Phase Transitions in High-Dimensional Geometry, with Implications for Modern Data Analysis and Signal Processing", arxiv:0906.2530
- Mathias Drton, "Likelihood ratio tests and singularities", Annals of Statistics **37**(2009): 979--1012, arxiv:math.ST/0703360
- Mathias Drton and Seth Sullivant, "Algebraic statistical models", math.ST/0703609
- Jin-Chuan Duan and Andras Fulop, "A stable estimator of the information matrix under EM for dependent data", Statistics and Computing **21**(2011): 83--91
- John C. Duchi, Michael I. Jordan, Martin J. Wainwright, "Local Privacy and Statistical Minimax Rates", arxiv:1302.3203
- Lutz Duembgen, Jon A. Wellner, "Confidence Bands for Distribution Functions: A New Look at the Law of the Iterated Logarithm", arxiv:1402.2918
- Morris L. Eaton, Multivariate Statistics: A Vector Space Approach ["a version of multivariate statistical theory in which vector space and invariance methods replace, to a large extent, more traditional multivariate methods"]
- Sam Efromovich
- "Distribution estimation for biased data", Journal of Statistical Planning and Inference **124**(2004): 1--43
- "Dimension Reduction and Adaptation in Conditional Density Estimation", Journal of the American Statistical Association **105**(2010): 761--774
- Nonparametric Curve Estimation

- Bradley Efron, "Size, power and false discovery rates", Annals of Statistics **35**(2007): 1351--1377, arxiv:0710.2245
- Bradley Efron and Trevor Hastie, Computer Age Statistical Inference: Algorithms, Evidence, and Data Science
- Werner Ehm, Jürgen Kornmeier, and Sven P. Heinrich, "Multiple testing along a tree", Electronic Journal of Statistics **4**(2010): 461--471 =? arxiv:0902.2296
- Thibault Espinasse, Paul Rochet, "A Cramér-Rao inequality for non differentiable models", arxiv:1204.2763
- Michael Evans and Gun Ho Jang, "Invariant P-values for model checking", Annals of Statistics **38**(2010): 512--525
- Jianqing Fan and Jian Zhang, "Sieve empirical likelihood ratio tests for nonparametric functions", Annals of Statistics **32**(2004): 1858--1907, math.ST/0503667
- Stefano Favaro, Antonio Lijoi, and Igor Prünster, "Asymptotics for a Bayesian nonparametric estimator of species variety", Bernoulli **18**(2012): 1267--1283
- Thomas S. Ferguson, A Course in Large Sample Theory
- Jean-David Fermanian and Bernard Salanié, "A Nonparametric Simulated Maximum Likelihood Estimation Method", Econometric Theory **20**(2004): 701--734
- Ana K. Fermin and Carenne Ludena, "A Statistical view of Iterative Methods for Linear Inverse Problems", math.ST/0504064
- Luisa Turrin Fernholz, von Mises Calculus for Statistical Functionals
- S. E. Fienberg, P. Hersh, A. Rinaldo and Y. Zhou, "Maximum Likelihood Estimation in Latent Class Models For Contingency Table Data", arxiv:0709.3535
- D. A. S. Fraser, N. Reid, E. Marras and G. Y. Yi, "Default priors for Bayesian and frequentist inference", Journal of the Royal Statistical Society B **72**(2010): 631--654
- A. Fraysse, "Why minimax is not that pessimistic", arxiv:0902.3311 [Because, apparently, learning a generic function is just as hard as minimax leads you to think. Bummer if true.]
- Magalie Fromont and Béatrice Laurent, "Adaptive goodness-of-fit tests in a density model", Annals of Statistics **34**(2006): 680--720, math.ST/0607013
- Axel Gandy and Patrick Rubin-Delanchy, "An algorithm to compute the power of Monte Carlo tests with guaranteed precision", Annals of Statistics **41**(2013): 125--142, arxiv:1110.1248
- Surya Ganguli and Haim Sompolinsky, "Statistical Mechanics of Compressed Sensing", Physical Review Letters **104**(2010): 188701
- Seymour Geisser, Predictive Inference
- Christopher R. Genovese and Larry Wasserman, "Confidence sets for nonparametric wavelet regression", Annals of Statistics **33**(2005): 698--729, math.ST/0505632
- Evarist Giné and Richard Nickl, Mathematical Foundations of Infinite-Dimensional Statistical Models
- Josep Ginebra, "On the Measure of the Information in a Statistical Experiment", Bayesian Analysis (2007): 167--212
- Tilmann Gneiting, Fadoua Balabdaoui, and Adrian E. Raftery, "Probabilistic forecasts, calibration and sharpness", Journal of the Royal Statistical Society B **69**(2007): 243--268
- Tilmann Gneiting and Roopesh Ranjan, "Combining predictive distributions", Electronic Journal of Statistics **7**(2013): 1747--1782
- Yuri Golubev, Vladimir Spokoiny, "Exponential bounds for minimum contrast estimators", arxiv:0901.0655
- Grassberger and Nadal (eds.), From Statistical Physics to Statistical Inference and Back
- Ulf Grenander, Abstract Inference
- Peter Guttorp, "Statistics and Climate", Annual Review of Statistics and Its Applications **1**(2014): 87--101
- Robert Hable, "Asymptotic Normality of Support Vector Machines for Classification and Regression", arxiv:1010.0535
- Peter Hall, Hans-Georg Müller, Fang Yao, "Estimation of functional derivatives", Annals of Statistics **37**(2009): 3307--3329, arxiv:0909.1157
- Marc Hallin, Davy Paindaveine, and Miroslav Siman, "Multivariate quantiles and multiple-output regression quantiles: From L1 optimization to halfspace depth", Annals of Statistics **38**(2010): 635--669
- Bruce E. Hansen, "Interval Forecasts and Parameter Uncertainty", Journal of Econometrics **135**(2006): 377--398 [Preprint]
- Wolfgang Härdle, Marlene Müller, Stefan Sperlich and Axel Werwatz, Nonparametric and Semiparametric Models: An Introduction
- Matthew T. Harrison, "Valid p-Values using Importance Sampling", arxiv:104.2910
- David F. Hendry and Jurgen A. Doornik, Empirical Model Discovery and Theory Evaluation: Automatic Selection Methods in Econometrics
- Heng Lian, "Empirical Likelihood Confidence Intervals for Nonparametric Functional Data Analysis", arxiv:0904.0843
- David A. Hensher et al. Applied Choice Analysis: A Primer ["Application of quantitative statistical methods to study choices made by individuals"]
- Tim Hesterberg, Nam Hee Choi, Lukas Meier, Chris Fraley, "Least angle and $\ell_1$ penalized regression: A review", Statistics Surveys **2**(2008): 61--93, arxiv:0802.0964
- David Hinkley, "Predictive Likelihood", Annals of Statistics **7**(1979): 718--728
- Nils Lid Hjort, Ian W. McKeague, Ingrid Van Keilegom, "Extending the scope of empirical likelihood", Annals of Statistics **37**(2009): 1079--1111, arxiv:0904.2949
- Peter D. Hoff, "A hierarchical eigenmodel for pooled covariance estimation", Journal of the Royal Statistical Society B **71**(2009): 971--992
- Peter Hoff, Jon Wakefield, "Bayesian sandwich posteriors for pseudo-true parameters", arxiv:1211.0087
- Marc Hoffmann and Richard Nickl, "On adaptive inference and confidence bands", Annals of Statistics **39**(2011): 2383--2409
- Torsten Hothorn, Thomas Kneib, Peter Bühlmann, "Conditional transformation models", Journal of the Royal Statistical Society B **forthcoming**
- Joel L. Horowitz, Semiparametric and Nonparametric Methods in Econometrics
- Serkan Hosten, Amit Khetan and Bernd Sturmfels, "Solving the Likelihood Equations", math.ST/0408270
- Ping-Hung Hsieh, "A nonparametric assessment of model adequacy based on Kullback-Leibler divergence", Statistics and Computing **23**(2013): 149--162
- Dayu Huang, Sean Meyn, "Generalized Error Exponents For Small Sample Universal Hypothesis Testing", IEEE Transactions on Information Theory **59**(2013): 8157--8181, arxiv:1204.1563
- Mia Hubert, Peter J. Rousseeuw, Stefan Van Aelst, "High-Breakdown Robust Multivariate Methods", Statistical Science **23**(2008): 92--119, arxiv:0808.0657
- Alexander Ilin, Tapani Raiko, "Practical Approaches to Principal Component Analysis in the Presence of Missing Values", Journal of Machine Learning Research **11**(2010): 1957--2000
- Stefano M. Iacus and Davide La Torre
- "Approximating Distribution Functions by Iterated Function Systems," math.PR/0111152
- "Nonparametric estimation of distribution and density functions in presence of missing data: an IFS approach," math.PR/0302016

- Leah Jager, Jon A. Wellner, "Goodness-of-fit tests via phi-divergences", Annals of Statistics **35**(2007): 2018--2053, arxiv:math/0603238
- Thomas Jaki and R. Webster West, "Maximum Kernel Likelihood Estimation", Journal of Computational and Graphical Statistics **17**(2008): 976--993
- Jana Jankova, Sara van de Geer, "Confidence intervals for high-dimensional inverse covariance estimation", arxiv:1403.6752
- Jiantao Jiao, Kartik Venkat, Tsachy Weissman, "Maximum Likelihood Estimation of Functionals of Discrete Distributions", arxiv:1406.6959
- Adam M. Johansen, Arnaud Doucet and Manuel Davy, "Particle methods for maximum likelihood estimation in latent variable models", Statistics and Computing **18**(2008): 47--57
- Ana Justel, Daniel Pena, Ruben Zamar, "A multivariate Kolmogorov-Smirnov test of goodness of fit", Statistics and Probability Letters **35**(1997): 251--259 [PDF reprint via Prof. Pena]
- Paul Kabaila and Kreshna Syuhada, "The Asymptotic Efficiency of Improved Prediction Intervals", arxiv:0901.1911
- Ata Kaban, "Non-parametric detection of meaningless distances in high dimensional data", Statistics and Computing **22**(2011): 375--385
- Oscar Kempthorne, "The classical problem of inference--goodness of fit", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 235--249
- D. F. Kerridge, "Inaccuracy and Inference", Journal of the Royal Statistical Society B **23**(1961): 184--194
- Yuichi Kitamura, "Empirical likelihood methods in econometrics: Theory and Practice", Cowles Foundation Discussion Paper No. 1569 (2006)
- Ioannis Kontoyiannis and S. P. Meyn, "Computable exponential bounds for screened estimation and simulation", Annals of Applied Probability **18**(2008): 1491--1518, arxiv:math/0612040
- Jim Kuelbs and Anand N. Vidyashankar, "Asymptotic inference for high-dimensional data", Annals of Statistics **38**(2010): 836--869
- Solomon Kullback, "Probability densities with given marginals," Annals of Mathematical Statistics **39**(1968): 1236--1243
- Masayuki Kumon and Akimichi Takemura, "On a simple strategy weakly forcing the strong law of large numbers in the bounded forecasting game", math.PR/0508190 ["In the framework of the game-theoretic probability of Shafer and Vovk (2001) ... construct an explicit strategy weakly forcing the strong law of large numbers (SLLN) in the bounded forecasting game. ... simple finite-memory strategy based on the past average of Reality's moves, which weakly forces the strong law of large numbers with the convergence rate of $O(\sqrt{\log n/n})$.... We show that if Reality violates SLLN, then the exponential growth rate of Skeptic's capital process is explicitly described in terms of the Kullback divergence between the average of Reality's moves when she violates SLLN and the average when she observes SLLN."]
- Tze Leung Lai, Shulamith T. Gross, and David Bo Shen, "Evaluating probability forecasts", Annals of Statistics **39**(2011): 2356--2382, arxiv:1202.5140
- Mikhail Langovoy, "Data-driven goodness-of-fit tests", arxiv:0708.0169
- Guy Lebanon, The Analysis of Data
- Stephen M. S. Lee, "Hybrid confidence regions based on data depth", Journal of the Royal Statistical Society B **74**(2012): 91--109
- Youngjo Lee and John A. Nelder, "Likelihood Inference for Models with Unobservables: Another View", Statistical Science **24**(2009): 255--269, arxiv:1010.0303 [with discussion and replies following]
- E. L. Lehmann and Joseph P. Romano, "Generalizations of the Familywise Error Rate", Annals of Statistics **33**(2005): 1138--1154, math.ST/0507420
- Matthieu Lerasle, "Adaptive non-asymptotic confidence balls in density estimation", arxiv:1007.4528
- M. Lerasle, R. I. Oliveira, "Robust Empirical Mean Estimators", arxiv:1112.3914
- Bo Li and Marc G. Genton, "Nonparametric Identification of Copula Structures", Journal of the American Statistical Association **108**(2013): 666--675
- Feng Liang, Sayan Mukherjee, Mike West, "The Use of Unlabeled Data in Predictive Modeling", Statistical Science **22**(2007): 189--205, arxiv:0710.4618
- Percy Liang, Francis Bach, Guillaume Bouchard and Michael I. Jordan, "Asymptotically Optimal Regularization in Smooth Parametric Models" [PDF preprint via Prof. Jordan]
- Bruce G. Lindsay, Marianthi Markatou, Surajit Ray, Ke Yang, Shu-Chuan Chen, "Quadratic distances on probabilities: A unified foundation", Annals of Statistics **36**(2008): 983--1006, arxiv:0804.0991
- Richard A. Lockhart, "Conditional limit laws for goodness-of-fit tests", Bernoulli **18**(2012): 857--882
- Thomas Lumley, Complex Surveys: A Guide to Analysis Using R
- Victor S. L'vov, Anna Pomyalov and Itamar Procaccia, "Outliers, Extreme Events and Multiscaling," nlin.CD/0009049
- Christian K. Machens, "Adaptive sampling by information maximization," physics/0112070
- Edouard Machery, "Power and Negative Results", Philosophy of Science **79**(2012): 808--820
- Robert Mariano, Til Schuermann and Melvyn J. Weeks (eds.), Simulation-Based Inference in Econometrics: Methods and Applications
- Ryan Martin, Chuanhai Liu, "Inferential models: A framework for prior-free posterior probabilistic inference", arxiv:1206.4091
- McCabe and Tremayne, Modern Asymptotic Theory
- Peter McCullagh, Tensor Methods in Statistics
- Mead et al., Statistical Principles for the Design of Experiments: Applications to Real Experiments
- Nicolai Meinshausen and John Rice, "Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses", math.ST/0501289
- Alexander Meister, Deconvolution Problems in Nonparametric Statistics ["e.g., density estimation based on contaminated data, errors-in-variables regression, and image reconstruction"]
- K. L. Mengersen, P. Pudlo, C. P. Robert, "Bayesian computation via empirical likelihood", arxiv:1205.5658
- Vladimir N. Minin, John D. O'Brien, Arseni Seregin, "Empirically corrected estimation of complete-data population summaries under model misspecification", arxiv:0911.0930
- David Mumford and Agnes Desolneux, Pattern Theory: The Stochastic Analysis of Real-World Signals
- Richard Nickl, "Donsker-type theorems for nonparametric maximum likelihood estimators", Probability Theory and Related Fields **138**(2007): 411--449
- Andrey Novikov, "Sequential multiple hypothesis testing in presence of control variables", Kybernetika **45**(2009): 507--528, arxiv:0812.2712
- Wojciech Olszewski, Alvaro Sandroni, "A nonmanipulable test", Annals of Statistics **37**(2009): 1013--1039, arxiv:0904.0338
- Giulio Palombo, "Multivariate Goodness of Fit Procedures for Unbinned Data: An Annotated Bibliography", arxiv:1102.2407
- Leandro Pardo, Statistical Inference Based on Divergence Measures
- Hanxiang Peng and Anton Schick, "Empirical likelihood approach to goodness of fit testing", Bernoulli **19**(2013): 954--981
- William Perkins, Mark Tygert, Rachel Ward, "Significance testing without truth", arxiv:1301.1208
- Mario Peruggia, Jason Hsu, and Yifan Huang, "Cartesian displays of many interval estimates", Electronic Journal of Statistics **7**(2013): 91--104
- Tomaz Podobnik and Tomi Zivko, "On Consistent and Calibrated Inference about the Parameters of Sampling Distributions", physics/0508017
- Thorsten Poeschel, Werner Ebeling, and Helge Rose, "Guessing probability distributions from small samples", Journal of Statistical Physics **80**(1995): 1443, cond-mat/0203467
- Dimitris N. Politis, Model-Free Prediction and Regression: A Transformation-Based Approach to Inference
- David Pollard, "Some thoughts on Le Cam's statistical decision theory", arxiv:1107.3811
- Benedikt M. Pötscher, "Confidence Sets Based on Sparse Estimators are Necessarily Large", arxiv:0711.1036
- Joel Predd, Robert Seiringer, Elliott H. Lieb, Daniel Osherson, Vincent Poor, Sanjeev Kulkarni, "Probabilistic coherence and proper scoring rules", IEEE Transactions on Information Theory **55**(2009): 4786, arxiv:0710.3183
- Ramsay and Silverman, Functional Data Analysis
- C. Radhakrishna Rao, "Diversity: Its measurement, decomposition, apportionment and analysis", Sankhya: The Indian Journal of Statistics **44(A)**(1982): 1--22 [Sankhya is not in JSTOR! Why is Sankhya not in JSTOR?!?!]
- R.-D. Reiss and M. Thomas, Statistical Analysis of Extreme Values: With Applications to Insurance, Finance, Hydrology and Other Fields
- Irina Rish, Sparse Modeling: Theory, Algorithms, and Applications
- Irina Rish et al. (eds.), Practical Applications of Sparse Modeling
- James Robins, Lingling Li, Eric Tchetgen, Aad van der Vaart, "Higher order influence functions and minimax estimation of nonlinear functionals", arxiv:0805.3040
- James Robins and Aad van der Vaart, "Adaptive nonparametric confidence sets", Annals of Statistics **34**(2006): 229--253, arxiv:math/0605473 ["We construct honest confidence regions for a Hilbert space-valued parameter in various statistical models. The confidence sets can be centered at arbitrary adaptive estimators, and have diameter which adapts optimally to a given selection of models."]
- Sylvain Rubenthaler, Tobias Ryden and Magnus Wiktorsson, "Fast simulated annealing in $\mathbb{R}^d$ and an application to maximum likelihood estimation", math.PR/0609353
- Birgit Rudloff, Ioannis Karatzas, "Testing composite hypotheses via convex duality", Bernoulli **16**(2010): 1224--1239, arxiv:0809.4297
- Susanne M. Schennach, "Point estimation with exponentially tilted empirical likelihood", Annals of Statistics **35**(2007): 634--672, arxiv:0708.1874
- Tore Schweder and Nils Lid Hjort, Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions
- Emilio Seijo and Bodhisattva Sen, "A continuous mapping theorem for the smallest argmax functional", Electronic Journal of Statistics **5**(2011): 421--439
- Haochang Shou, Russell T. Shinohara, Han Liu, Daniel Reich, and Ciprian Crainiceanu, "Soft Null Hypotheses: A Case Study of Image Enhancement Detection in Brain Lesions", Johns Hopkins University, Dept. of Biostatistics Working Paper 257
- Ricardo Silva, Robert B. Gramacy, "Gaussian Process Structural Equation Models with Latent Variables", arxiv:1002.4802 [Heard the talk at UAI 2010, but I want the details.]
- Kesar Singh, Minge Xie, William E. Strawderman, "Confidence distribution (CD) -- distribution estimator of a parameter", pp. 132--150 in Regina Liu, William Strawderman and Cun-Hui Zhang (eds.), Complex Datasets and Inverse Problems: Tomography, Networks and Beyond
- Karline Soetaert, Thomas Petzoldt, "Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME", Journal of Statistical Software **33**(2010): 3
- Jascha Sohl-Dickstein, "The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use", arxiv:1205.1828
- Jascha Sohl-Dickstein, Peter Battaglino, Michael R. DeWeese, "Minimum Probability Flow Learning", arxiv:0906.4779
- Christopher G. Small, The Statistical Theory of Shape
- Aris Spanos
- "Revisiting the Omitted Variables Argument: Substantive vs. Statistical Adequacy" [PDF preprint]
- "Is Frequentist Testing Vulnerable to the Base-Rate Fallacy?", Philosophy of Science **77**(2010): 565--583

- Vladimir Spokoiny
- "A penalized exponential risk bound in parametric estimation", arxiv:0903.1721
- "Parametric estimation. Finite sample theory", Annals of Statistics **40**(2012): 2877--2909, arxiv:1111.3029

- Pablo Sprechmann, Ignacio Ramírez, Guillermo Sapiro, Yonina Eldar, "C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework", arxiv:1006.1346
- Johan A.K. Suykens, Carlos Alzate, and Kristiaan Pelckmans, "Primal and dual model representations in kernel-based learning", Statistics Surveys **4**(2010): 148--183
- Olivier Thas, Comparing Distributions [mostly about goodness-of-fit tests]
- F. V. Tkachov, "Quasi-optimal observables: Attaining the quality of maximal likelihood in parameter estimation when only a MC event generator is available," physics/0108030
- Samuel Vaiter, Mohammad Golbabaee, Jalal Fadili, Gabriel Peyré, "Model Selection with Piecewise Regular Gauges", arxiv:1307.2342
- Mark J. van der Laan and Sherri Rose, Targeted Learning: Causal Inference for Observational and Experimental Data
- Aki Vehtari and Janne Ojanen, "A survey of Bayesian predictive methods for model assessment, selection and comparison", Statistics Surveys **6**(2012): 142--228
- Guenther Walther, "The Average Likelihood Ratio for Large-scale Multiple Testing and Detecting Sparse Mixtures", arxiv:1111.0328
- Xiaogang Wang and James V. Zidek, "Selecting likelihood weights by cross-validation", Annals of Statistics **33**(2005): 463--500, math.ST/0505599
- Fabian L. Wauthier, Michael I. Jordan, "Heavy-Tailed Processes for Selective Shrinkage", arxiv:1006.3901
- Holger Wendland, Scattered Data Approximation
- Halbert White
- Asymptotic Theory for Econometricians [Useful source, it seems, for non-IID central limit theorems]
- "A Reality Check for Data Snooping", Econometrica **68**(2000): 1097--1126

- Christopher K. I. Williams, "How to Pretend That Correlated Variables Are Independent by Using Difference Observations", Neural Computation **17**(2005): 1--6
- Wei Biao Wu, "On false discovery control under dependence", Annals of Statistics **36**(2008): 364--380, arxiv:0903.1971
- Yuhong Yang and Andrew Barron, "Information-theoretic determination of minimax rates of convergence", Annals of Statistics **27**(1999): 1564--1599