Two-Sample Tests

Last update: 07 Jul 2025 12:01
First version: 3 September 2014

That is, statistical tests for whether two samples came from the same distribution.

(If you came here because a search engine directed you while you were looking for recipes on how to do a specific test, like a \( t \) test, Mann-Whitney, etc., sorry.)

EunYi Chung, Joseph P. Romano, "Exact and asymptotically robust permutation tests", Annals of Statistics 41 (2013): 484--507, arxiv:1304.5939
Bharath K. Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Bernhard Schölkopf, and Gert R. G. Lanckriet, "On the empirical estimation of integral probability metrics", Electronic Journal of Statistics 6 (2012): 1550--1599
Susan Wei, Chihoon Lee, Lindsay Wichers, Gen Li, J.S. Marron, "Direction-Projection-Permutation for High Dimensional Hypothesis Tests", arxiv:1304.0796
Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya, Masashi Sugiyama, "Relative Density-Ratio Estimation for Robust Distribution Comparison", Neural Computation 25 (2013): 1324--1370 [This is not the relative density between \( p \) and \( q \) in the Handcock-Morris sense, just the ratio between \( p \) and \( ap+(1-a)q \), for adjustable \( a \). (This is to keep the density ratio from going to infinity anywhere.) The thing seems a bit hackish, but still worth considering...]
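[To spell out why mixing \( p \) into the denominator keeps the ratio bounded: this is a one-line gloss rather than anything taken from the paper, the symbol \( r_a \) is notation introduced here, and it assumes \( 0 < a < 1 \). Since \( (1-a)q(x) \geq 0 \), wherever \( p(x) > 0 \) we have
\[ r_a(x) = \frac{p(x)}{a p(x) + (1-a) q(x)} \leq \frac{p(x)}{a p(x)} = \frac{1}{a} , \]
and \( r_a(x) = 0 \) wherever \( p(x) = 0 < q(x) \). Setting \( a = 0 \) gives back the ordinary ratio \( p/q \), which can be arbitrarily large where \( q \) is small relative to \( p \).]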

Somayeh Danafar, Paola M.V. Rancoita, Tobias Glasmachers, Kevin Whittingstall, Juergen Schmidhuber, "Testing Hypotheses by Regularized Maximum Mean Discrepancy", arxiv:1305.0423
Subhra Sankar Dhar, Biman Chakraborty, Probal Chaudhuri, "Comparison of multivariate distributions using quantile-quantile plots and related tests", Bernoulli 20 (2014): 1484--1506, arxiv:1407.1212
Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, Alexander Smola, "A Kernel Two-Sample Test", Journal of Machine Learning Research 13 (2012): 723--773
Norbert Henze, "A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences", Annals of Statistics 16 (1988): 772--783
T. Kanamori, T. Suzuki and M. Sugiyama, "\( f \)-Divergence Estimation and Two-Sample Homogeneity Test Under Semiparametric Density-Ratio Models", IEEE Transactions on Information Theory 58 (2012): 708--720
David M. Ruth and Robert A. Koyak, "Nonparametric Tests for Homogeneity Based on Non-Bipartite Matching", Journal of the American Statistical Association 106 (2011): 1615--1625
Dino Sejdinovic, Bharath Sriperumbudur, Arthur Gretton, Kenji Fukumizu, "Equivalence of distance-based and RKHS-based statistics in hypothesis testing", Annals of Statistics 41 (2013): 2263--2291, arxiv:1207.6076
Nicolas Städler, Sach Mukherjee, "Two-Sample Testing in High-Dimensional Models", arxiv:1210.4584
Olivier Thas, Comparing Distributions [mostly about goodness-of-fit tests]
Mans Thulin, "A high-dimensional two-sample test for the mean using random subspaces", arxiv:1304.4564