Density Estimation
05 Mar 2024 10:00
Yet Another Inadequate Placeholder, spun off from statistics.
Two topics of particular interest: estimating conditional densities, and estimating the densities of short subsequences from time series.
- See also:
- density estimation on graphical models
- exponential families
- independence tests and dependence measures
- learning theory
- mixture models
- Recommended:
- Luc Devroye and Gabor Lugosi, Combinatorial Methods in Density Estimation
- Peter Hall, Jeff Racine and Qi Li, "Cross-Validation and the Estimation of Conditional Probability Densities", Journal of the American Statistical Association 99 (2004): 1015--1026 [PDF]
- Jeffrey S. Racine, "Nonparametric Econometrics: A Primer", Foundations and Trends in Econometrics 3 (2008): 1--88 [Good primer on nonparametric techniques for regression, density estimation and hypothesis testing; next to no economic content (except for examples). Presumes reasonable familiarity with parametric statistics. PDF reprint]
- Jeffrey S. Simonoff, Smoothing Methods in Statistics
- Larry Wasserman
- All of Statistics
- All of Nonparametric Statistics
- Recommended, close-ups:
- Bruce E. Hansen
- "Nonparametric Conditional Density Estimation" [PDF preprint, 2004]
- "Nonparametric Estimation of Smooth Conditional Distributions" [Preprint]
- Rafael Izbicki, A Spectral Series Approach to High-Dimensional Nonparametric Inference [Ph.D. thesis, CMU statistics department, 2014]
- Rafael Izbicki, Ann Lee, Chad Schafer, "High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation", AISTATS 2014: 420--429
- Abdelkader Mokkadem, Mariane Pelletier, Yousri Slaoui, "The stochastic approximation method for the estimation of a multivariate probability density", arxiv:0807.2960
- Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya, Masashi Sugiyama, "Relative Density-Ratio Estimation for Robust Distribution Comparison", Neural Computation 25 (2013): 1324--1370 [This is not the relative density between \( p \) and \( q \) in the Handcock-Morris sense, just the ratio between \( p \) and \( ap+(1-a)q \), for adjustable \( a \). (This is to keep the density ratio from going to infinity anywhere.) The thing seems a bit hackish, but still worth considering...]
- Lin Yuan, Sergey Kirshner, Robert Givan, "Estimating Densities with Non-Parametric Exponential Families", arxiv:1206.5036
- Victoria Zinde-Walsh, "Nonparametric functionals as generalized functions", arxiv:1303.1435
- Modesty forbids me to recommend:
- The lecture on density estimation in my advanced data analysis class notes
- To read:
- Ethan Anderes, Marc Coram, "A general spline representation for nonparametric and semiparametric density estimates using diffeomorphisms", arxiv:1205.5314
- Andrew R. Barron and Chyong-Hwa Sheu, "Approximation of Density Functions by Sequences of Exponential Families", Annals of Statistics 19 (1991): 1347--1369
- Alain Berlinet, Gérard Biau and Laurent Rouvière, "Optimal L1 Bandwidth selection for variable kernel density estimates", Statistics and Probability Letters 74 (2005): 116--128 ["[O]ne can improve performance of kernel density estimates by varying the bandwidth with the location and/or the sample data at hand. Our interest in this paper is in the data-based selection of a variable bandwidth... an automatic selection procedure inspired by the combinatorial tools developed in Devroye and Lugosi... the expected L1 error of the corresponding selected estimate is up to a given constant multiple of the best possible error plus an additive term which tends to zero under mild assumptions"]
- Z. I. Botev, J. F. Grotowski, and D. P. Kroese, "Kernel density estimation via diffusion", Annals of Statistics 38 (2010): 2916--2957
- Blair Bilodeau, Dylan J. Foster, Daniel M. Roy, "Minimax rates for conditional density estimation via empirical entropy", Annals of Statistics 51 (2023): 762--790, arxiv:2109.10461
- Susan M. Buchman, Ann B. Lee, Chad M. Schafer, "High-Dimensional Density Estimation via SCA: An Example in the Modelling of Hurricane Tracks", arxiv:0907.0199
- Serge Cohen, Erwan Le Pennec, "Conditional Density Estimation by Penalized Likelihood Model Selection", arxiv:1103.2021
- Tilman M. Davies, Martin L. Hazelton, Jonathan C. Marshall, "sparr: Analyzing Spatial Relative Risk Using Fixed and Adaptive Kernel Density Estimation in R", Journal of Statistical Software 39:1 (2011)
- Sam Efromovich
- "Distribution estimation for biased data", Journal of Statistical Planning and Inference 124 (2004): 1--43
- "Conditional density estimation in a regression setting", Annals of Statistics 35 (2007): 2504--2535, arxiv:0803.2984
- "Dimension Reduction and Adaptation in Conditional Density Estimation", Journal of the American Statistical Association 105 (2010): 761--774
- Bradley Efron and Robert Tibshirani, "Using Specially Designed Exponential Families for Density Estimation", Annals of Statistics 24 (1996): 2431--2461
- Evarist Giné and Hailin Sang, "Uniform asymptotics for kernel density estimators with variable bandwidths", arxiv:1007.4350
- Evarist Giné and Richard Nickl
- "Adaptive estimation of a distribution function and its density in sup-norm loss by wavelet and spline projections", Bernoulli 16 (2010): 1137--1163, arxiv:0805.1404
- "Uniform limit theorems for wavelet density estimators", arxiv:0805.1406 = Annals of Probability 37 (2009): 1605--1646
- "Confidence bands in density estimation", Annals of Statistics 38 (2010): 1122--1170
- David Haussler, Manfred Opper, "Mutual information, metric entropy and cumulative relative entropy risk", Annals of Statistics 25 (1997): 2451--2492
- Han Liu, John Lafferty and Larry Wasserman, "Tree Density Estimation", arxiv:1001.1557
- Han Liu, Min Xu, Haijie Gu, Anupam Gupta, John Lafferty, Larry Wasserman, "Forest Density Estimation", Journal of Machine Learning Research 12 (2011): 907--951
- Yanyuan Ma, Jeffrey D. Hart and Raymond J. Carroll, "Density Estimation in Several Populations With Uncertain Population Membership", Journal of the American Statistical Association 106 (2011): 1180--1192
- Reason Lesego Machete, "Early Warning with Calibrated and Sharper Probabilistic Forecasts", arxiv:1112.6390
- Brendan P. M. McCabe, Gael M. Martin, David Harris, "Efficient probabilistic forecasts for counts", Journal of the Royal Statistical Society B 73 (2011): 253--272
- Andrew B. Nobel, Gusztav Morvai, Sanjeev R. Kulkarni, "Density estimation from an individual numerical sequence", IEEE Transactions on Information Theory 44 (1998): 537--541, arxiv:0710.2500
- Andriy Norets, "Approximation of conditional densities by smooth mixtures of regressions", Annals of Statistics 38 (2010): 1733--1766, arxiv:1010.0581
- Michael Nussbaum, "Asymptotic Equivalence of Density Estimation and Gaussian White Noise", Annals of Statistics 24 (1996): 2399--2430
- Alessandro Rinaldo, Aarti Singh, Rebecca Nugent, Larry Wasserman, "Stability of Density-Based Clustering", arxiv:1011.2771
- Alessandro Rinaldo and Larry Wasserman, "Generalized Density Clustering", Annals of Statistics 38 (2010): 2678--2722, arxiv:0907.3454
- Olga Y. Savchuk, Jeffrey D. Hart, and Simon J. Sheather, "Indirect Cross-Validation for Density Estimation", Journal of the American Statistical Association 105 (2010): 415--423
- Bharath Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Aapo Hyvärinen, Revant Kumar, "Density Estimation in Infinite Dimensional Exponential Families", Journal of Machine Learning Research 18:57 (2017): 1--59
- Yuefeng Wu, Subhashis Ghosal, "Kullback Leibler property of kernel mixture priors in Bayesian density estimation", Electronic Journal of Statistics 2 (2008): 298--331, arxiv:0710.2746
- Bin Yu, "Density Estimation in the \( L^{\infty} \) Norm for Dependent Data with Applications to the Gibbs Sampler", Annals of Statistics 21 (1993): 711--735
- Adriano Zanin Zambom, Ronaldo Dias, "A Review of Kernel Density Estimation with Applications to Econometrics", arxiv:1212.2812