Kernel Methods in Statistics and Machine Learning
Last update: 27 Feb 2026 12:16
First version: 15 May 2019
Yet Another Inadequate Placeholder
I mean in the sense of kernels used to measure how similar two objects or data-points are, rather than things like kernel regression smoothing (Nadaraya-Watson smoothing) or density estimation, where the kernel is used to smooth out data. --- Why we have two distinct sets of methods with the same name is a convoluted story (you should pardon the expression).
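To make the distinction concrete, here is a minimal sketch (mine, not taken from any of the references below) in which one and the same Gaussian function plays both roles: first as a similarity measure between data points, collected into a Gram matrix, and then as a set of smoothing weights in a Nadaraya-Watson estimate. The function name `gaussian_kernel`, the bandwidth value, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between every row of x and every row of y.

    Hypothetical helper for illustration; bandwidth=1.0 is an arbitrary choice.
    """
    # Pairwise squared Euclidean distances, shape (len(x), len(y))
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # five simulated data points in the plane
y = rng.normal(size=5)        # simulated responses, used only for smoothing

# Sense 1 (this notebook): the kernel measures similarity between points.
# The Gram matrix K[i, j] = k(X[i], X[j]) is symmetric and positive semi-definite.
K = gaussian_kernel(X, X)

# Sense 2 (kernel smoothing): the kernel supplies local averaging weights.
# Nadaraya-Watson prediction at a new point is a weighted mean of the responses.
x_new = np.zeros((1, 2))
weights = gaussian_kernel(x_new, X)              # shape (1, 5)
y_hat = (weights @ y) / weights.sum(axis=1)      # shape (1,)
```

In the first use, what matters is that `K` is a valid positive semi-definite Gram matrix; in the second, only that the weights are non-negative and fall off with distance, so that `y_hat` is a local average.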
- See also:
- Data Mining
- Hilbert Space Methods for Statistics and Probability
- Kernelized Factor Models
- Random Feature Methods in Statistics and Machine Learning [Originally, a Cool Trick for making kernel methods faster, but of independent interest]
- Regression, especially Nonparametric Regression
- Statistics
- Recommended, big picture:
- John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis
- Recommended, historical interest:
- Emanuel Parzen [Discussed elsewhere]
- "A New Approach to the Synthesis of Optimal Smoothing and Prediction Systems", pp. 75--108 in Richard Bellman (ed.), Mathematical Optimization Techniques: Papers presented at the Symposium on Mathematical Optimization Techniques, Santa Monica, California, October 18--20, 1960 (Berkeley: University of California Press, 1963)
- "An Approach to Time Series Analysis", The Annals of Mathematical Statistics 32 (1961): 951--989 [JSTOR]
- Modesty forbids me to recommend:
- Lecture 15 for 36-462, Data Mining / Methods of Statistical Learning [I've written other accounts of kernel methods for students over the years, but this is one of the more fleshed out.]
- To read:
- Arvind Agarwal, Hal Daume III, "Generative Kernels for Exponential Families", AISTATS 2011
- Arash A. Amini and Zahra S. Razaee, "Concentration of kernel matrices with application to kernel spectral clustering", Annals of Statistics 49 (2021): 531--556
- Marco Cuturi and Kenji Fukumizu, "Multiresolution Kernels", cs.LG/0507033
- Kris De Brabanter, Jos De Brabanter, Johan A. K. Suykens and Bart De Moor, "Kernel Regression in the Presence of Correlated Errors", Journal of Machine Learning Research 12 (2011): 1955--1976
- Michiel Debruyne, Mia Hubert, Johan A.K. Suykens, "Model Selection in Kernel Based Regression using the Influence Function", Journal of Machine Learning Research 9 (2008): 2377--2400
- Robert Hable, "Asymptotic Confidence Sets for General Nonparametric Regression and Classification by Regularized Kernel Methods", arxiv:1203.4354
- Nancy Heckman, "The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy", arxiv:1111.1915
- Thomas Jaki and R. Webster West, "Maximum Kernel Likelihood Estimation", Journal of Computational and Graphical Statistics 17 (2008): 976--993
- Johan A. K. Suykens, Carlos Alzate, and Kristiaan Pelckmans, "Primal and dual model representations in kernel-based learning", Statistics Surveys 4 (2010): 148--183