Regression, especially Nonparametric Regression
25 Nov 2024 22:22
"Regression", in statistical jargon, is the problem of guessing the average level of some quantitative response variable from various predictor variables.
Linear regression is perhaps the single most common quantitative tool in economics, sociology, and many other fields; it's certainly the most common use of statistics. (Analysis of variance, arguably more common in psychology and biology, is a disguised form of regression.) While linear regression deserves a place in statistics, that place should be nowhere near as large and prominent as it currently is. There are very few situations where we actually have scientific support for linear models. Fortunately, very flexible nonlinear regression methods now exist, and from the user's point of view are just as easy as linear regression, and at least as insightful. (Regression trees and additive models, in particular, are just as interpretable.) At the very least, if you do have a particular functional form in mind for the regression, linear or otherwise, you should use a non-parametric regression to test the adequacy of that form.
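To make the last suggestion concrete, here is a minimal sketch of checking a linear form against a nonparametric fit by comparing held-out error. Everything specific in it is an assumption made for illustration: the sinusoidal "truth", the noise level, the Gaussian-kernel (Nadaraya-Watson) smoother standing in for whatever nonparametric method one prefers, and the fixed bandwidth, which in practice would be picked by cross-validation.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def fit_kernel(xs, ys, bandwidth):
    """Nadaraya-Watson kernel smoother with a Gaussian kernel."""
    def predict(x):
        ws = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(ws, ys)) / sum(ws)
    return predict

def mse(predict, xs, ys):
    """Mean squared prediction error on the given points."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def simulate(f, n, rng, noise_sd=0.1):
    """Draw n points with x uniform on [0, 1] and y = f(x) + Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(42)
truth = lambda x: math.sin(2 * math.pi * x)  # assumed nonlinear truth, for illustration
train = simulate(truth, 200, rng)
test = simulate(truth, 200, rng)

line = fit_line(*train)
smooth = fit_kernel(*train, bandwidth=0.05)  # bandwidth is an assumed, untuned value

mse_line = mse(line, *test)
mse_smooth = mse(smooth, *test)
print(f"held-out MSE, linear: {mse_line:.3f}")
print(f"held-out MSE, kernel: {mse_smooth:.3f}")
```

If the linear form were adequate, the two held-out errors would be about the same; a smoother that predicts much better on new data is evidence against the linear specification. A formal test needs more care than this comparison (several of the specification-testing references listed further down address exactly that).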
From a technical point of view, the main drawback of modern regression methods is that their extra flexibility comes at the price of less "efficiency" --- estimates converge more slowly, so you have less precision for the same amount of data. There are some situations where you'd prefer to have more precise estimates from a bad model than less precise estimates from a model which makes smaller systematic errors, but I don't think that's what most users of linear regression are choosing to do; they're just taught to type lm rather than gam. In this day and age, though, I don't understand why they aren't taught the latter.
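The efficiency cost is easy to see in simulation. The sketch below is illustrative only: it assumes the truth really is linear (the most favorable case for the linear model), uses a Gaussian-kernel smoother as the flexible method, and shrinks the bandwidth like n^(-1/5) with an arbitrary constant of 0.2. The sample sizes, noise level and number of replications are likewise assumptions.

```python
import math
import random

def ols_line(xs, ys):
    """Least-squares line, returned as a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + b * (x - mx)

def kernel_smoother(xs, ys, h):
    """Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h."""
    def predict(x):
        ws = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(ws, ys)) / sum(ws)
    return predict

def average_error(n, reps, rng):
    """Mean squared estimation error of each method, averaged over reps samples."""
    truth = lambda x: 1.0 + 2.0 * x             # assumed: the linear model is correct
    grid = [(i + 0.5) / 50 for i in range(50)]  # evaluation points in (0, 1)
    err_line = err_kernel = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        ys = [truth(x) + rng.gauss(0.0, 0.5) for x in xs]
        line = ols_line(xs, ys)
        smooth = kernel_smoother(xs, ys, h=0.2 * n ** -0.2)  # assumed bandwidth rule
        err_line += sum((line(g) - truth(g)) ** 2 for g in grid) / len(grid)
        err_kernel += sum((smooth(g) - truth(g)) ** 2 for g in grid) / len(grid)
    return err_line / reps, err_kernel / reps

rng = random.Random(0)
results = {n: average_error(n, reps=40, rng=rng) for n in (50, 400)}
for n, (e_line, e_kernel) in results.items():
    print(f"n={n}: linear error {e_line:.4f}, kernel error {e_kernel:.4f}")
```

When the linear model is right, it estimates the regression function more precisely at every sample size. The comparison flips once the truth is appreciably nonlinear, because the linear fit's error then stops shrinking at its approximation bias.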
(Of course, for the statistician, a lot of the more flexible regression methods look more or less like linear regression in some disguised form, because fundamentally all it does is project on to a function basis. So it's not crazy to make it a foundational topic for statisticians. We should not, however, give the rest of the world the impression that the hat matrix is the source of all knowledge.)
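As a small illustration of the family resemblance, the sketch below builds the smoothing matrix of a Gaussian-kernel smoother and checks that the fitted values are linear in the responses, just as with the hat matrix of ordinary least squares. The design points, bandwidth and test functions are assumptions chosen for the example. One caveat: a kernel smoother's matrix is not idempotent, so it is a linear smoother rather than a literal projection. Series and regression-spline estimators are the ones that project in the strict sense.

```python
import math

def smoother_matrix(xs, bandwidth):
    """Row i holds the weights the kernel smoother puts on each response at xs[i]."""
    rows = []
    for x in xs:
        ws = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        total = sum(ws)
        rows.append([w / total for w in ws])
    return rows

def apply_matrix(matrix, ys):
    """Fitted values: the matrix-vector product of the smoother matrix and ys."""
    return [sum(w * y for w, y in zip(row, ys)) for row in matrix]

xs = [i / 19 for i in range(20)]              # assumed design: 20 equally spaced points
ys = [math.sin(2 * math.pi * x) for x in xs]  # two arbitrary response vectors
zs = [x ** 2 for x in xs]

W = smoother_matrix(xs, bandwidth=0.1)        # assumed bandwidth
fit_y = apply_matrix(W, ys)
fit_z = apply_matrix(W, zs)
fit_sum = apply_matrix(W, [y + 2 * z for y, z in zip(ys, zs)])

# Linearity in the responses: smoothing y + 2z equals smooth(y) + 2 * smooth(z).
max_gap = max(abs(s - (a + 2 * b)) for s, a, b in zip(fit_sum, fit_y, fit_z))

# The trace of the smoother matrix plays the role of the number of parameters.
edf = sum(W[i][i] for i in range(len(xs)))
print(f"largest linearity gap: {max_gap:.2e}")
print(f"effective degrees of freedom: {edf:.2f}")
```

The trace of W is the usual definition of effective degrees of freedom for a linear smoother; for an ordinary least-squares hat matrix the same trace is exactly the number of coefficients.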
The use of regression, linear or otherwise, for causal inference, rather than prediction, is a different, and far more sordid, story.
- See also:
- Additive Models, Generalized Additive Models, etc.
- Computational Statistics
- Data Mining
- Kernel Methods
- Learning Theory
- Model Selection
- Nearest Neighbors
- Neural Nets
- Nonparametric Confidence Sets for Functions (for nonparametric regression)
- Optimal Linear Prediction and Estimation
- Social Science Methodology
- (Decision, Classification, Regression, Prediction) Trees in Statistics and Machine Learning
- What Is the Right Null Model for Linear Regression?
- Recommended, more general:
- Richard A. Berk
- Julian J. Faraway
- Linear Models with R
- Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models
- Trevor Hastie and Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction [This is a corner-stone book, but is about much, much more than just regression.]
- Jeffrey S. Racine, "Nonparametric Econometrics: A Primer", Foundations and Trends in Econometrics 3 (2008): 1--88 [Good primer of nonparametric techniques for regression, density estimation and hypothesis testing; next to no economic content (except for examples). PDF reprint]
- Jeffrey S. Simonoff, Smoothing Methods in Statistics
- Larry Wasserman
- All of Statistics
- All of Nonparametric Statistics
- Notes for 36-707, Regression Analysis
- Sanford Weisberg, Applied Linear Regression
- Recommended, more specialized:
- Norman H. Anderson and James Shanteau, "Weak inference with linear models", Psychological Bulletin 84 (1977): 1155--1170 [A demonstration of why you should not rely on $R^2$ to back up your claims]
- Mikhail Belkin, Partha Niyogi, Vikas Sindhwani, "Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples", Journal of Machine Learning Research 7 (2006): 2399--2434
- Peter J. Bickel and Bo Li, "Local polynomial regression on unknown manifolds", pp. 177--186 in Regina Liu, William Strawderman and Cun-Hui Zhang (eds.), Complex Datasets and Inverse Problems: Tomography, Networks and Beyond (2007) ["`naive' multivariate local polynomial regression can adapt to local smooth lower dimensional structure in the sense that it achieves the optimal convergence rate for nonparametric estimation of regression functions ... when the predictor variables live on or close to a lower dimensional manifold"]
- Michael H. Birnbaum, "The Devil Rides Again: Correlation as an Index of Fit", Psychological Bulletin 79 (1973): 239--242
- Lawrence D. Brown and Mark G. Low, "Asymptotic Equivalence of Nonparametric Regression and White Noise", Annals of Statistics 24 (1996): 2384--2398 [JSTOR]
- Peter Bühlmann, M. Kalisch and M. H. Maathuis, "Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm", Biometrika 97 (2010): 261--278
- Peter Bühlmann and Sara van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications [State-of-the-art (2011) compendium of what's known about using high-dimensional regression, especially but not just the Lasso.]
- A. Buja, R. Berk, L. Brown, E. George, E. Pitkin, M. Traskin, K. Zhan, L. Zhao, "Models as Approximations: How Random Predictors and Model Violations Invalidate Classical Inference in Regression", arxiv:1404.1578
- Raymond J. Carroll, Aurore Delaigle, and Peter Hall, "Nonparametric Prediction in Measurement Error Models", Journal of the American Statistical Association 104 (2009): 993--1003
- Raymond J. Carroll, J. D. Maca and D. Ruppert, "Nonparametric regression in the presence of measurement error", Biometrika 86 (1999): 541--554
- Kevin A. Clarke, "The Phantom Menace: Omitted Variables Bias in Econometric Research" [PDF. Or: Kitchen-sink regressions considered harmful. Including extra variables in your linear regression may or may not reduce the bias in your estimate of any particular coefficients of interest, depending on the correlations between the added variables, the predictors of interest, the response, and omitted relevant variables. Adding more variables always increases the variance of your estimates.]
- Eduardo Corona, Terran Lane, Curtis Storlie, Joshua Neil, "Using Laplacian Methods, RKHS Smoothing Splines and Bayesian Estimation as a framework for Regression on Graph and Graph Related Domains" [Technical report, University of New Mexico Computer Science, 2008-06, PDF]
- Paramveer S. Dhillon, Dean P. Foster, Sham M. Kakade, Lyle H. Ungar, "A Risk Comparison of Ordinary Least Squares vs. Ridge Regression", Journal of Machine Learning Research 14 (2013): 1505--1511
- William H. DuMouchel and Greg J. Duncan, "Using Sample Survey Weights in Multiple Regression Analysis of Stratified Samples", Proceedings of the Survey Research Methods Section, American Statistical Association (1981), pp. 629--637 [PDF reprint; presumably very similar to "Using Sample Survey Weights to Compare Various Linear Regression Models", Journal of the American Statistical Association 78 (1983): 535--543, but I have not looked at the latter]
- Andrew Gelman and Iain Pardoe, "Average predictive comparisons for models with nonlinearity, interactions, and variance components", Sociological Methodology forthcoming (2007) [PDF preprint, Gelman's comments]
- Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer, "Efficient Regression in Metric Spaces via Approximate Lipschitz Extension", arxiv:1111.4470
- László Györfi, Michael Kohler, Adam Krzyzak and Harro Walk, A Distribution-Free Theory of Nonparametric Regression
- Berthold R. Haag, "Non-parametric Regression Tests Using Dimension Reduction Techniques", Scandinavian Journal of Statistics 35 (2008): 719--738
- Peter Hall, "On Bootstrap Confidence Intervals in Nonparametric Regression", Annals of Statistics 20 (1992): 695--711
- Peter Hall and Joel Horowitz, "A simple bootstrap method for constructing nonparametric confidence bands for functions", Annals of Statistics 41 (2013): 1892--1921, arxiv:1309.4864
- W. Härdle and E. Mammen, "Comparing Nonparametric Versus Parametric Regression Fits", Annals of Statistics 21 (1993): 1926--1947
- Jeffrey D. Hart, Nonparametric Smoothing and Lack-of-Fit Tests
- Yongmiao Hong and Halbert White, "Consistent Specification Testing Via Nonparametric Series Regression", Econometrica 63 (1995): 1133--1159 [JSTOR]
- Adel Javanmard, Andrea Montanari, "Confidence Intervals and Hypothesis Testing for High-Dimensional Regression", arxiv:1306.3171
- M. Kohler, A. Krzyzak and D. Schafer, "Application of structural risk minimization to multivariate smoothing spline regression estimates", Bernoulli 8 (2002): 475--490
- Alexander Korostelev, "A minimaxity criterion in nonparametric regression based on large-deviations probabilities", Annals of Statistics 24 (1996): 1075--1083
- Jon Lafferty and Larry Wasserman [To be honest, I haven't checked to see how different these two papers actually are...]
- "Rodeo: Sparse Nonparametric Regression in High Dimensions", math.ST/0506342
- "Rodeo: Sparse, greedy nonparametric regression", Annals of Statistics 36 (2008): 27--63, arxiv:0803.1709
- Diane Lambert and Kathryn Roeder, "Overdispersion Diagnostics for Generalized Linear Models", Journal of the American Statistical Association 90 (1995): 1225--1236 [JSTOR]
- Abdelkader Mokkadem, Mariane Pelletier, Yousri Slaoui, "Revisiting Révész's stochastic approximation method for the estimation of a regression function", arxiv:0812.3973
- Patrick O. Perry, "Fast Moment-Based Estimation for Hierarchical Models", arxiv:1504.04941
- Garvesh Raskutti, Martin J. Wainwright, and Bin Yu, "Early stopping and non-parametric regression: An optimal and data-dependent stopping rule", arxiv:1306.3574
- B. W. Silverman, "Spline Smoothing: The Equivalent Variable Kernel Method", Annals of Statistics 12 (1984): 898--916
- Ryan J. Tibshirani, "Degrees of Freedom and Model Search", arxiv:1402.1920
- Gerhard Tutz, Regression for Categorical Data
- Sara van de Geer, Empirical Process Theory in M-Estimation
- Grace Wahba, Spline Models for Observational Data
- Jianming Ye, "On Measuring and Correcting the Effects of Data Mining and Model Selection", Journal of the American Statistical Association 93 (1998): 120--131
- Recommended, historical:
- Erich L. Lehmann, "On the history and use of some standard statistical models", pp. 114--126 in Deborah Nolan and Terry Speed (eds.), Probability and Statistics: Essays in Honor of David A. Freedman
- E. T. Whittaker, "On a New Method of Graduation", Proceedings of the Edinburgh Mathematical Society 41 (1922): 63--75 [Introduces splines, complete with the Bayesian derivation (if you are into that sort of thing), though without the name.]
- Modesty forbids me to recommend:
- Advanced Data Analysis from an Elementary Point of View [Presumes at least some acquaintance with linear regression, however]
- The Truth About Linear Regression [Draft textbook for a first course in linear regression for undergraduates]
- To read, teaching:
- Adrian W. Bowman and Adelchi Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations
- Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- To read, learning:
- Elena Andreou and Bas J. M. Werker, "An Alternative Asymptotic Analysis of Residual-Based Statistics", Review of Economics and Statistics 94 (2012): 88--99
- Sylvain Arlot, "Choosing a penalty for model selection in heteroscedastic regression", arxiv:0812.3141
- Sylvain Arlot and Pascal Massart, "Data-driven Calibration of Penalties for Least-Squares Regression", Journal of Machine Learning Research 10 (2009): 245--279
- Anil Aswani, Peter Bickel, and Claire Tomlin, "Regression on manifolds: Estimation of the exterior derivative", Annals of Statistics 39 (2011): 48--81
- Jean-Baptiste Aubin, Samuela Leoni-Aubin, "A Simple Misspecification Test for Regression Models", arxiv:1003.2294
- Jean-Yves Audibert and Olivier Catoni, "Robust linear least squares regression", Annals of Statistics 39 (2011): 2766--2794
- Alexandre Belloni, Victor Chernozhukov, "High Dimensional Sparse Econometric Models: An Introduction", arxiv:1106.5242
- Gilles Blanchard, Nicole Kraemer, "Kernel Conjugate Gradient is Universally Consistent", arxiv:0902.4380 ["approximate solutions are constructed by projections onto a nested set of data-dependent subspaces"]
- Borowiak, Model Discrimination for Nonlinear Regression Models
- Lawrence D. Brown, T. Tony Cai, and Harrison H. Zhou, "Nonparametric regression in exponential families", Annals of Statistics 38 (2010): 2005--2046
- Peter Bühlmann, "Statistical significance in high-dimensional linear models", arxiv:1202.1377 [Not sure if this goes beyond what's in Bühlmann and van de Geer]
- Florentina Bunea, Seth Strimas-Mackey, Marten Wegkamp, "Interpolating Predictors in High-Dimensional Factor Regression", Journal of Machine Learning Research 23 (2022): 10
- T. Tony Cai, "Minimax and Adaptive Inference in Nonparametric Function Estimation", Statistical Science 27 (2012): 31--50, arxiv:1203.4911
- T. Tony Cai, Harrison H. Zhou, "Asymptotic equivalence and adaptive estimation for robust nonparametric regression", Annals of Statistics 37 (2009): 3204--3235, arxiv:0909.0343
- Andrew V. Carter, "Asymptotic approximation of nonparametric regression experiments with unknown variances", Annals of Statistics 35 (2007): 1644--1673, arxiv:0710.3647
- Ming-Yen Cheng, Hau-tieng Wu, "Local Linear Regression on Manifolds and its Geometric Interpretation", arxiv:1201.0327
- Laëtitia Comminges, Arnak Dalalyan, "Tight conditions for consistent variable selection in high dimensional nonparametric regression", arxiv:1102.3616
- R. Dennis Cook, Liliana Forzani, and Adam J. Rothman, "Estimating sufficient reductions of the predictors in abundant high-dimensional regressions", Annals of Statistics 40 (2012): 353--384
- Arnak Dalalyan and Alexandre B. Tsybakov, "Sparse Regression Learning by Aggregation and Langevin Monte-Carlo", arxiv:0903.1223
- Laurie Davies, Lutz Dümbgen, "A Model-free Approach to Linear Least Squares Regression with Exact Probabilities and Applications to Covariate Selection", arxiv:1906.01990
- Robert Davies, Christopher Withers, and Saralees Nadarajah, "Confidence intervals in a regression with both linear and non-linear terms", Electronic Journal of Statistics 5 (2011): 603--618
- Aurore Delaigle, Peter Hall, Hans-Georg Müller, "Accelerated convergence for nonparametric regression with coarsened predictors", Annals of Statistics 35 (2007): 2639--2653, arxiv:0803.3017
- Ruben Dezeure, Peter Bühlmann, Lukas Meier, Nicolai Meinshausen, "High-dimensional Inference: Confidence intervals, p-values and R-Software hdi", arxiv:1408.4026
- Charanpal Dhanjal, Nicolas Baskiotis, Stéphan Clémençon and Nicolas Usunier, "An Empirical Comparison of V-fold Penalisation and Cross Validation for Model Selection in Distribution-Free Regression", arxiv:1212.1780
- Wei Dou, David Pollard, Harrison H. Zhou, "Functional regression for general exponential families", arxiv:1001.3742
- Sam Efromovich
- Nonparametric Curve Estimation
- "Conditional density estimation in a regression setting", Annals of Statistics 35 (2007): 2504--2535, arxiv:0803.2984
- P. P. B. Eggermont, V. N. LaRiccia, "Uniform error bounds for smoothing splines", arxiv:math/0612776
- P. P. B. Eggermont and V. N. LaRiccia, Maximum Penalized Likelihood Estimation, vol. II: Regression [Enthusiastic review in JASA (104 (2010): 1628), appears self-contained]
- Jianqing Fan, Shaojun Guo and Ning Hao, "Variance estimation using refitted cross-validation in ultrahigh dimensional regression", Journal of the Royal Statistical Society B 74 (2012): 37--65
- Cheryl J. Flynn, Clifford M. Hurvich, Jeffrey S. Simonoff, "On the Sensitivity of the Lasso to the Number of Predictor Variables", arxiv:1403.4544
- Jose M. Gonzalez-Barrios and Silvia Ruiz-Velasco, "Regression analysis and dependence", Metrika 61 (2005): 73--87
- Juan M Gorriz, J. Ramirez, F. Segovia, F. J. Martinez-Murcia, C. Jiménez-Mesa, J. Suckling, "Statistical Agnostic Regression: a machine learning method to validate regression models", arxiv:2402.15213 [I am intensely skeptical, but I should read this before dismissing it.]
- Marvin H. J. Gruber, Regression Estimators: A Comparative Study
- Chong Gu, Smoothing Spline ANOVA Models
- Haijie Gu, John Lafferty, "Sequential Nonparametric Regression", arxiv:1206.6408
- Emmanuel Guerre and Pascal Lavergne, "Data-driven rate-optimal specification testing in regression models", Annals of Statistics 33 (2005): 840--870, math.ST/0505640
- P. Richard Hahn, Sayan Mukherjee, Carlos Carvalho, "Predictor-dependent shrinkage for linear regression via partial factor modeling", arxiv:1011.3725
- Peter Hall, Joel L. Horowitz, "Nonparametric methods for inference in the presence of instrumental variables", Annals of Statistics 33 (2005): 2904--2929, arxiv:math/0603130
- Bruce E. Hansen
- "Uniform Convergence Rates for Kernel Estimation with Dependent Data", Econometric Theory 24 (2008): 726--748 [abstract with link to free PDF]
- Econometrics
- Wolfgang Härdle, Applied Nonparametric Regression
- Wolfgang Härdle, Marlene Müller, Stefan Sperlich and Axel Werwatz, Nonparametric and Semiparametric Models: An Introduction
- Jeffrey D. Hart, "Smoothing-inspired lack-of-fit tests based on ranks", arxiv:0805.2285
- Elad Hazan, Tomer Koren, "Linear Regression with Limited Observation", arxiv:1206.4678
- Mohamed Hebiri and Sara A. Van De Geer, "The Smooth-Lasso and other $\ell_1+\ell_2$-penalized methods", arxiv:1003.4885
- Nancy Heckman, "The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy", arxiv:1111.1915
- Tim Hesterberg, Nam Hee Choi, Lukas Meier, Chris Fraley, "Least angle and $\ell_1$ penalized regression: A review", Statistics Surveys 2 (2008): 61--93, arxiv:0802.0964
- Jacob Hinkle, Prasanna Muralidharan, P. Thomas Fletcher, Sarang Joshi, "Polynomial Regression on Riemannian Manifolds", arxiv:1201.2395
- Giles Hooker and Saharon Rosset, "Prediction-based regularization using data augmented regression", Statistics and Computing 22 (2011): 237--249
- Joel L. Horowitz, Enno Mammen, "Rate-optimal estimation for a general class of nonparametric regression models with unknown link functions", Annals of Statistics 35 (2007): 2589--2619, arxiv:0803.2999
- Torsten Hothorn, Thomas Kneib, Peter Bühlmann, "Conditional transformation models", Journal of the Royal Statistical Society B forthcoming
- Salvatore Ingrassia, Simona C. Minotti, Giorgio Vittadini, "Local statistical modeling by cluster-weighted" [sic], arxiv:0911.2634 [Revisiting Gershenfeld et al.'s "cluster-weighted modeling" from a more properly statistical perspective]
- Sameer M. Jalnapurkar, "Learning a regression function via Tikhonov regularization", math.ST/0509420
- Bo Kai, Runze Li and Hui Zou, "Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression", Journal of the Royal Statistical Society B 72 (2010): 49--69
- Gerard Kerkyacharian, Mathilde Mougeot, Dominique Picard, Karine Tribouley, "Learning Out of Leaders", arxiv:1001.1919
- Estate V. Khmaladze, Hira L. Koul, "Goodness-of-fit problem for errors in nonparametric regression: Distribution free approach", Annals of Statistics 37 (2009): 3165--3185, arxiv:0909.0170
- Heeyoung Kim and Xiaoming Huo, "Asymptotic optimality of a multivariate version of the generalized cross validation in adaptive smoothing splines", Electronic Journal of Statistics 8 (2014): 159--183
- Hoyt Koepke, Mikhail Bilenko, "Fast Prediction of New Feature Utility", arxiv:1206.4680
- Michael R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference [partial PDF preprint]
- Nicole Kraemer, Anne-Laure Boulesteix, Gerhard Tutz, "Penalized Partial Least Squares Based on B-Splines Transformations", math.ST/0608576
- Tatyana Krivobokova, Thomas Kneib, and Gerda Claeskens, "Simultaneous Confidence Bands for Penalized Spline Estimators", Journal of the American Statistical Association 105 (2010): 852--863
- Arne Kovac, Andrew D.A.C. Smith, "Regression on a Graph", Journal of Computational and Graphical Statistics 20 (2011): 432--447, arxiv:0911.1928
- Tatyana Krivobokova, "Smoothing parameter selection in two frameworks for penalized splines", Journal of the Royal Statistical Society B 75 (2013): 725--741
- Rafal Kulik and Cornelia Wichelhaus, "Nonparametric conditional variance and error density estimation in regression models with dependent errors and predictors", Electronic Journal of Statistics 5 (2011): 856--898
- Pascal Lavergne, Samuel Maistre, and Valentin Patilea, "A significance test for covariates in nonparametric regression", Electronic Journal of Statistics 9 (2015): 643--678
- Tri M. Le, Bertrand S. Clarke, "Model Averaging Is Asymptotically Better Than Model Selection For Prediction", Journal of Machine Learning Research 23 (2022): 33
- Hannes Leeb, "Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process", Bernoulli 14 (2008): 661--690, arxiv:0802.3364
- Qi Li and Jeffrey Scott Racine, Nonparametric Econometrics: Theory and Practice
- Yehua Li and Tailen Hsing, "Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data", Annals of Statistics 38 (2010): 3321--3351
- Heng Lian, "Convergence of Nonparametric Functional Regression Estimates with Functional Responses", arxiv:1111.6230
- Han Liu, Xi Chen, John Lafferty and Larry Wasserman, "Graph-Valued Regression", NIPS 23 (2010) [PDF], arxiv:1006.3972
- Oliver Linton and Zhijie Xiao, "A Nonparametric Regression Estimator That Adapts To Error Distribution of Unknown Form", Econometric Theory 23 (2007): 371--413
- James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani, "Automatic Construction and Natural-Language Description of Nonparametric Regression Models", arxiv:1402.4304
- Po-Ling Loh, Martin J. Wainwright, "High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity", Annals of Statistics 40 (2012): 1637--1664, arxiv:1109.3714
- Djamal Louani, Sidi Mohamed Ould Maouloud, "Large Deviation Results for the Nonparametric Regression Function Estimator on Functional Data", arxiv:1111.5989
- Enno Mammen, Christoph Rothe, and Melanie Schienle, "Nonparametric regression with nonparametrically generated covariates", Annals of Statistics 40 (2012): 1132--1170
- Charles E. McCulloch, John M. Neuhaus, "Misspecifying the Shape of a Random Effects Distribution: Why Getting It Wrong May Not Matter", Statistical Science 26 (2011): 388--402, arxiv:1201.1980
- Hugh Miller and Peter Hall, "Local polynomial regression and variable selection", arxiv:1006.3342
- Jessica Minnier, Lu Tian and Tianxi Cai, "A Perturbation Method for Inference on Regularized Regression Estimates", Journal of the American Statistical Association 106 (2011): 1371--1382
- Ursula U. Müller and Ingrid Van Keilegom, "Efficient parameter estimation in regression with missing responses", Electronic Journal of Statistics 6 (2012): 1200--1219
- Richard Nickl and Sara van de Geer, "Confidence sets in sparse regression", Annals of Statistics 41 (2013): 2852--2876, arxiv:1209.1508
- Andriy Norets, "Approximation of conditional densities by smooth mixtures of regressions", Annals of Statistics 38 (2010): 1733--1766, arxiv:1010.0581
- Philippe Rigollet, "Maximum likelihood aggregation and misspecified generalized linear models", arxiv:0911.2919
- Cynthia Rudin, "Stability Analysis for Regularized Least Squares Regression", cs.LG/0502016
- Laura M. Sangalli, James O. Ramsay, Timothy O. Ramsay, "Spatial spline regression models", Journal of the Royal Statistical Society B 75 (2013): 681--703
- George A. F. Seber and C. J. Wild, Nonlinear Regression
- Arnab Sen, Bodhisattva Sen, "On Testing Independence and Goodness-of-fit in Linear Models", arxiv:1302.5831
- Zuofeng Shang and Guang Cheng, "Local and global asymptotic inference in smoothing spline models", Annals of Statistics 41 (2013): 2608--2638
- David Shilane, Richard H. Liang and Sandrine Dudoit, "Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation", UC Berkeley Biostatistics Working Paper 227 [Abstract, PDF]
- Tom A. B. Snijders and Johannes Berkhof, "Diagnostic Checks for Multilevel Models"
- Emre Soyer and Robin M. Hogarth, "The illusion of predictability: How regression statistics mislead experts" [PDF preprint]
- Aris Spanos, "Revisiting the Omitted Variables Argument: Substantive vs. Statistical Adequacy" [PDF preprint]
- Pablo Sprechmann, Ignacio Ramirez, Guillermo Sapiro and Yonina C. Eldar, "C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework", arxiv:1006.1346
- Ingo Steinwart and Andreas Christmann, Support Vector Machines
- Curtis B. Storlie, Howard D. Bondell, and Brian J. Reich, "A Locally Adaptive Penalty for Estimation of Functions With Varying Roughness", Journal of Computational and Graphical Statistics (2010): forthcoming
- Liangjun Su and Aman Ullah, "Local polynomial estimation of nonparametric simultaneous equations models", Journal of Econometrics 144 (2008): 193--218
- Ryan J. Tibshirani, "The Lasso Problem and Uniqueness", arxiv:1206.0313
- Jo-Anne Ting, Aaron D'Souza, Sethu Vijayakumar and Stefan Schaal, "Efficient Learning and Feature Selection in High-Dimensional Regression", Neural Computation 22 (2010): 831--886
- Daniell Toth and John L. Eltinge, "Building Consistent Regression Trees From Complex Sample Data", Journal of the American Statistical Association 106 (2011): 1626--1636
- Minh-Ngoc Tran, David Nott, Chenlei Leng, "The Predictive Lasso", arxiv:1009.2302
- Gerhard Tutz and Sebastian Petty, "Nonparametric estimation of the link function including variable selection", Statistics and Computing 22 (2011): 545--561
- Gerhard Tutz and Jan Ulbricht, "Penalized regression with correlation-based penalty", Statistics and Computing 19 (2008): 239--253
- Samuel Vaiter, Mohammad Golbabaee, Jalal Fadili, Gabriel Peyré, "Model Selection with Piecewise Regular Gauges", arxiv:1307.2342
- Sara van de Geer, Johannes Lederer, "The Lasso, correlated design, and improved oracle inequalities", arxiv:1107.0189
- Daniela M. Witten and Robert Tibshirani, "Covariance-regularized regression and classification for high dimensional problems", Journal of the Royal Statistical Society B 71 (2009): 615--636
- Yun Yang and Surya T. Tokdar, "Minimax-optimal nonparametric regression in high dimensions", Annals of Statistics 43 (2015): 652--674
- Adriano Zanin Zambom, Michael Akritas, "Nonparametric Model Checking and Variable Selection", arxiv:1205.6761
- Hao Helen Zhang, Guang Cheng and Yufeng Liu, "Linear or Nonlinear? Automatic Structure Discovery for Partially Linear Models", Journal of the American Statistical Association 106 (2011): 1099--1112 [Presumably they have a reason for not just using an additive model with an extra strong curvature penalty in each univariate smoother.]
- Zhibiao Zhao and Wei Biao Wu, "Confidence bands in nonparametric time series regression", Annals of Statistics 36 (2008): 1854--1878, arxiv:0808.1010
- Peng Zhao and Bin Yu, "On Model Selection Consistency of Lasso", Journal of Machine Learning Research 7 (2006): 2541--2563
- Hongtu Zhu, Joseph G. Ibrahim, Sikyum Lee, Heping Zhang, "Perturbation selection and influence measures in local influence analysis", Annals of Statistics 35 (2007): 2565--2588, arxiv:0803.2986
- Ying Zhu, "Phase transitions in nonparametric regressions: a curse of exploiting higher degree smoothness assumptions in finite samples", arxiv:2112.03626