Kernel Methods in Statistics and Machine Learning

Last update: 27 Feb 2026 12:16
First version: 15 May 2019

Yet Another Inadequate Placeholder

I mean kernels in the sense of functions used to measure how similar two objects or data points are, rather than things like kernel regression smoothing (Nadaraya-Watson smoothing) or kernel density estimation, where the kernel is used to smooth out data. Why we have two distinct sets of methods with the same name is a convoluted story (you should pardon the expression).
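To make the two senses concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from the references below: the Gaussian kernel as the example, the function names, the bandwidth parameter, and the toy data. The first use builds a Gram matrix of pairwise similarities, which is the raw material of the methods this notebook is about; the second uses the same function as a smoothing weight in a Nadaraya-Watson estimate.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel: 1 when x == y, decaying toward 0 as the points separate."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def gram_matrix(points, bandwidth=1.0):
    """Kernel as similarity: matrix of all pairwise similarities between data points."""
    return [[gaussian_kernel(a, b, bandwidth) for b in points] for a in points]

def nadaraya_watson(x0, xs, ys, bandwidth=1.0):
    """Kernel as smoother: weighted average of ys, weighted by closeness of xs to x0."""
    weights = [gaussian_kernel(x0, x, bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Toy data (hypothetical): four points on a line with responses y = x^2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]

K = gram_matrix(xs)                      # similarity sense
smoothed = nadaraya_watson(1.5, xs, ys)  # smoothing sense
```

In the similarity sense, downstream methods work with the matrix K alone and never need the coordinates of the points; in the smoothing sense, the kernel only supplies local averaging weights, so the estimate always lies between the smallest and largest observed response.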

Data Mining
Hilbert Space Methods for Statistics and Probability
Kernelized Factor Models
Random Feature Methods in Statistics and Machine Learning [Originally, a Cool Trick for making kernel methods faster, but of independent interest]
Regression, especially Nonparametric Regression
Statistics

John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis

Emanuel Parzen [Discussed elsewhere]
- "A New Approach to the Synthesis of Optimal Smoothing and Prediction Systems", pp. 75--108 in Richard Bellman (ed.), Mathematical Optimization Techniques: Papers presented at the Symposium on Mathematical Optimization Techniques, Santa Monica, California, October 18--20, 1960 (Berkeley: University of California Press, 1963)
- "An Approach to Time Series Analysis", The Annals of Mathematical Statistics 32 (1961): 951--989 [JSTOR]

Lecture 15 for 36-462, Data Mining / Methods of Statistical Learning [I've written other accounts of kernel methods for students over the years, but this is one of the more fleshed out.]

Arvind Agarwal, Hal Daume III, "Generative Kernels for Exponential Families", AISTATS 2011
Arash A. Amini and Zahra S. Razaee, "Concentration of kernel matrices with application to kernel spectral clustering", Annals of Statistics 49 (2021): 531--556
Marco Cuturi and Kenji Fukumizu, "Multiresolution Kernels", cs.LG/0507033
Kris De Brabanter, Jos De Brabanter, Johan A. K. Suykens and Bart De Moor, "Kernel Regression in the Presence of Correlated Errors", Journal of Machine Learning Research 12 (2011): 1955--1976
Michiel Debruyne, Mia Hubert and Johan A. K. Suykens, "Model Selection in Kernel Based Regression using the Influence Function", Journal of Machine Learning Research 9 (2008): 2377--2400
Robert Hable, "Asymptotic Confidence Sets for General Nonparametric Regression and Classification by Regularized Kernel Methods", arxiv:1203.4354
Nancy Heckman, "The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy", arxiv:1111.1915
Thomas Jaki and R. Webster West, "Maximum Kernel Likelihood Estimation", Journal of Computational and Graphical Statistics 17 (2008): 976--993
Johan A. K. Suykens, Carlos Alzate, and Kristiaan Pelckmans, "Primal and dual model representations in kernel-based learning", Statistics Surveys 4 (2010): 148--183