## Two-Sample Tests

*26 Oct 2019 14:14*

That is, statistical tests for whether two samples came from the same distribution.
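
In symbols, one minimal way to state the problem is the following. The notation is an assumption introduced here for illustration, not taken from any of the references below. Given independent samples from two unknown distributions \( P \) and \( Q \), we test

\[
X_1, \ldots, X_n \sim P, \qquad Y_1, \ldots, Y_m \sim Q, \qquad H_0 : P = Q \quad \text{versus} \quad H_1 : P \neq Q .
\]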

(If you came here because a search engine directed you while you were looking for recipes on how to do a specific test, like a \( t \) test, Mann-Whitney, etc., sorry.)

See also: Independence Tests, Conditional Independence Tests, Measures of Dependence and Conditional Dependence

- Recommended (totally inadequate):
- EunYi Chung, Joseph P. Romano, "Exact and asymptotically robust permutation tests", Annals of Statistics **41** (2013): 484--507, arxiv:1304.5939
- Bharath K. Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Bernhard Schölkopf, and Gert R. G. Lanckriet, "On the empirical estimation of integral probability metrics", Electronic Journal of Statistics **6** (2012): 1550--1599
- Susan Wei, Chihoon Lee, Lindsay Wichers, Gen Li, J.S. Marron, "Direction-Projection-Permutation for High Dimensional Hypothesis Tests", arxiv:1304.0796
- Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya, Masashi Sugiyama, "Relative Density-Ratio Estimation for Robust Distribution Comparison", Neural Computation **25** (2013): 1324--1370 [This is *not* the relative density between \( p \) and \( q \) in the Handcock-Morris sense, just the ratio between \( p \) and \( ap+(1-a)q \), for adjustable \( a \). (This is to keep the density ratio from going to infinity anywhere.) The thing seems a bit hackish, but still worth considering...]
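
The bracketed remark about the ratio between \( p \) and \( ap+(1-a)q \) can be checked numerically. The sketch below is only an illustration under assumed inputs, not the estimator from the paper: the two Gaussian densities, the helper names `normal_pdf` and `relative_density_ratio`, and the choice \( a = 0.5 \) are all hypothetical. It shows that the plain ratio \( p/q \) blows up in the tails while the relative ratio stays below \( 1/a \), since \( ap + (1-a)q \geq ap \).

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def relative_density_ratio(p_x, q_x, a):
    """Ratio p / (a*p + (1 - a)*q) at one point, for a mixing weight 0 <= a <= 1."""
    return p_x / (a * p_x + (1.0 - a) * q_x)


# Assumed example: p = N(0, 2^2) has heavier tails than q = N(0, 1),
# so the plain ratio p/q grows without bound as |x| increases.
a = 0.5
for x in [0.0, 2.0, 5.0, 10.0]:
    p_x = normal_pdf(x, 0.0, 2.0)
    q_x = normal_pdf(x, 0.0, 1.0)
    plain = p_x / q_x
    relative = relative_density_ratio(p_x, q_x, a)
    print(f"x = {x:5.1f}   p/q = {plain:.3e}   relative ratio = {relative:.4f}")
```

Setting \( a = 0 \) recovers the ordinary density ratio, and \( a = 1 \) gives a ratio that is identically 1, so \( a \) trades off boundedness against sensitivity to differences between \( p \) and \( q \).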

- To read:
- Somayeh Danafar, Paola M.V. Rancoita, Tobias Glasmachers, Kevin Whittingstall, Juergen Schmidhuber, "Testing Hypotheses by Regularized Maximum Mean Discrepancy", arxiv:1305.0423
- Subhra Sankar Dhar, Biman Chakraborty, Probal Chaudhuri, "Comparison of multivariate distributions using quantile-quantile plots and related tests", Bernoulli **20** (2014): 1484--1506, arxiv:1407.1212
- Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, Alexander Smola, "A Kernel Two-Sample Test", Journal of Machine Learning Research **13** (2012): 723--773
- Norbert Henze, "A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences", Annals of Statistics **16** (1988): 772--783
- T. Kanamori, T. Suzuki and M. Sugiyama, "\( f \)-Divergence Estimation and Two-Sample Homogeneity Test Under Semiparametric Density-Ratio Models", IEEE Transactions on Information Theory **58** (2012): 708--720
- David M. Ruth and Robert A. Koyak, "Nonparametric Tests for Homogeneity Based on Non-Bipartite Matching", Journal of the American Statistical Association **106** (2011): 1615--1625
- Dino Sejdinovic, Bharath Sriperumbudur, Arthur Gretton, Kenji Fukumizu, "Equivalence of distance-based and RKHS-based statistics in hypothesis testing", Annals of Statistics **41** (2013): 2263--2291, arxiv:1207.6076
- Nicolas Städler, Sach Mukherjee, "Two-Sample Testing in High-Dimensional Models", arxiv:1210.4584
- Olivier Thas, Comparing Distributions [mostly about goodness-of-fit tests]
- Mans Thulin, "A high-dimensional two-sample test for the mean using random subspaces", arxiv:1304.4564