Manifold Learning
27 Feb 2017 16:30
Suppose that we observe random vectors $X$ in some $p$-dimensional space, most often ordinary Euclidean space $\mathbb{R}^p$. Sometimes the probability distribution of $X$ is supported on a large volume in $\mathbb{R}^p$, maybe even the whole space; this is the classical situation considered by multivariate statistics. Sometimes, however, the support is concentrated on, or at least near, some geometric structure of dimension $q < p$. If the low-dimensional structure is a finite set of 0-dimensional points, we have clustering. If it is a linear subspace, we have the situation dealt with by factor analysis. More interesting to me is the more general situation where the low-dimensional structure is a smooth but curved manifold. The goal of manifold learning is then to reconstruct the underlying manifold from observations of $X$. We might also be content with knowing some geometric or topological properties of the manifold, such as its dimension.
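For concreteness, here is a minimal sketch (mine, not part of the references below) of the reconstruction problem, using scikit-learn's implementation of Roweis and Saul's locally linear embedding on a synthetic "swiss roll" — a 2-dimensional manifold curled up in $\mathbb{R}^3$:

```python
# A hedged illustration: recover a low-dimensional representation of data
# sampled from a 2-d manifold embedded in R^3, via locally linear embedding.
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# X has shape (1000, 3): noisy samples from the swiss-roll manifold;
# t is the underlying 1-d "unrolling" coordinate of each point.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Ask for a 2-d embedding, since the roll is intrinsically 2-dimensional.
# n_neighbors controls the size of the local linear patches.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0)
Y = lle.fit_transform(X)

print(Y.shape)  # one 2-d coordinate per original 3-d observation
```

The choice of the intrinsic dimension (here, 2) is itself an estimation problem; see the Levina and Bickel paper below.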
See also: Factor Models; Machine Learning, Statistical Inference, and Induction; State-Space Reconstruction; Statistics; Statistics on Manifolds
- Recommended:
- Mikhail Belkin and Partha Niyogi, "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation", Neural Computation 15 (2003): 1373--1396
- Ann B. Lee and Larry Wasserman, "Spectral Connectivity Analysis", Journal of the American Statistical Association 105 (2010): 1241--1255, arxiv:0811.0121
- Elizaveta Levina and Peter J. Bickel, "Maximum Likelihood Estimation of Intrinsic Dimension", NIPS 2004
- Sam T. Roweis and Lawrence K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding", Science 290 (2000): 2323--2326
- Lawrence K. Saul and Sam T. Roweis, "Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds", Journal of Machine Learning Research 4 (2003): 119--155
- Modesty forbids me to recommend:
- My lectures 10--15 in my data mining class
- To read:
- Mridul Aanjaneya, Frédéric Chazal, Daniel Chen, Marc Glisse, Leonidas J. Guibas, Dmitriy Morozov, "Metric graph reconstruction from noisy data", inria-00630774
- Evgeni Begelfor and Michael Werman, "The World is not Always Flat, or, Learning Curved Manifolds" [PDF]
- Alexander V. Bernstein, Alexander P. Kuleshov, "Tangent Bundle Manifold Learning via Grassmann&Stiefel Eigenmaps", arxiv:1212.6031
- Christopher J. C. Burges, "Dimension Reduction: A Guided Tour", Foundations and Trends in Machine Learning 2:4 (2010) [Preprint version]
- Kevin M. Carter, Raviv Raich, William G. Finn, Alfred O. Hero, "FINE: Fisher Information Non-parametric Embedding", arxiv:0802.2050
- Claudio Ceruti, Simone Bassis, Alessandro Rozza, Gabriele Lombardi, Elena Casiraghi, Paola Campadelli, "DANCo: Dimensionality from Angle and Norm Concentration", arxiv:1206.3881
- Dong Chen, Hans-Georg Müller, "Nonlinear manifold representations for functional data", Annals of Statistics 40 (2012): 1--29, arxiv:1205.6040
- Lisha Chen, Andreas Buja, "Stress Functions for Nonlinear Dimension Reduction, Proximity Analysis, and Graph Drawing", Journal of Machine Learning Research 14 (2013): 1145--1173
- Ming-yen Cheng and Hau-tieng Wu, "Local Linear Regression on Manifolds and Its Geometric Interpretation", Journal of the American Statistical Association 108 (2013): 1421--1434, arxiv:1201.0327
- Andreas Damianou, Carl Ek, Michalis Titsias, Neil Lawrence, "Manifold Relevance Determination", ICML 2012, arxiv:1206.4610
- Charles Fefferman, Sanjoy Mitter, Hariharan Narayanan, "Testing the Manifold Hypothesis", arxiv:1310.0425
- Christopher Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman
- "Minimax Manifold Estimation", Journal of Machine Learning Research 13 (2012): 1263--1291, arxiv:1007.0549
- "Manifold estimation and singular deconvolution under Hausdorff loss", Annals of Statistics 40 (2012): 941--963, arxiv:1109.4540
- Samuel Gerber and Ross Whitaker, "Regularization-Free Principal Curve Estimation", Journal of Machine Learning Research 14 (2013): 1285--1302
- Evarist Giné and Vladimir Koltchinskii, "Empirical graph Laplacian approximation of Laplace--Beltrami operators: Large sample results", in High Dimensional Probability: Proceedings of the Fourth International Conference (Giné, Koltchinskii, Li and Zinn, eds.), arxiv:math/0612777
- Yair Goldberg, Alon Zakai, Dan Kushnir, Ya'acov Ritov, "Manifold Learning: The Price of Normalization", Journal of Machine Learning Research 9 (2008): 1909--1939
- Dian Gong, Xuemei Zhao, Gerard Medioni, "Robust Multiple Manifolds Structure Learning", arxiv:1206.4624
- John C. Gower and Jörg Blasius, "Multivariate Prediction with Nonlinear Principal Components Analysis"
- "Theory", Quality and Quantity 39 (2005): 359--372
- "Application", Quality and Quantity 39 (2005): 373--390
- Reinhard Heckel, Helmut Bölcskei, "Robust Subspace Clustering via Thresholding", arxiv:1307.4891
- Alan Julian Izenman, Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning
- Daniel N. Kaslovsky, Francois G. Meyer, "Non-Asymptotic Analysis of Tangent Space Perturbation", arxiv:1111.4601
- Brian Kulis, "Metric Learning: A Survey", Foundations and Trends in Machine Learning 5 (2013): 287--364
- Neil D. Lawrence, "A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models", Journal of Machine Learning Research 13 (2012): 1609--1638
- Sayan Mukherjee, Qiang Wu, Ding-Xuan Zhou, "Learning gradients on manifolds", Bernoulli 16 (2010): 181--207, arxiv:1002.4283
- Partha Niyogi, "Manifold Regularization and Semi-supervised Learning: Some Theoretical Analyses", Journal of Machine Learning Research 14 (2013): 1229--1250
- Partha Niyogi, Stephen Smale, and Shmuel Weinberger, "Finding the Homology of Submanifolds with High Confidence from Random Samples", Discrete and Computational Geometry OF1--OF23 (2006) [PDF reprint]
- Dominique Perrault-Joncas, Marina Meila, "Non-linear dimensionality reduction: Riemannian metric estimation and the problem of geometric discovery", arxiv:1305.7255
- Salah Rifai, Yoshua Bengio, Yann Dauphin, Pascal Vincent, "A Generative Process for Sampling Contractive Auto-Encoders", ICML 2012, arxiv:1206.6434
- A. Rozza, G. Lombardi, C. Ceruti, E. Casiraghi, P. Campadelli, "Novel high intrinsic dimensionality estimators", Machine Learning 89 (2012): 37--65
- Patrick T. Sadtler, Kristin M. Quick, Matthew D. Golub, Steven M. Chase, Stephen I. Ryu, Elizabeth C. Tyler-Kabara, Byron M. Yu and Aaron P. Batista, "Neural constraints on learning", Nature 512 (2014): 423--426
- Amit Singer, Hau-tieng Wu, "Spectral Convergence of the Connection Laplacian from random samples", arxiv:1306.1587
- Hiromichi Suetani, Karin Soejima, Rei Matsuoka, Ulrich Parlitz, and Hiroki Hata, "Manifold learning approach for chaos in the dripping faucet", Physical Review E 86 (2012): 036209
- Ronen Talmon and Ronald R. Coifman, "Empirical intrinsic geometry for nonlinear modeling and time series filtering", Proceedings of the National Academy of Sciences (USA) 110 (2013): 12535--12540
- Laurens van der Maaten, "Barnes-Hut-SNE", arxiv:1301.3342
- Laurens van der Maaten and Geoffrey Hinton, "Visualizing Data using t-SNE", Journal of Machine Learning Research 9 (2008): 2579--2605 [SNE = "stochastic neighbor embedding", a manifold-learning technique]
- Max Vladymyrov and Miguel Carreira-Perpinan, "Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings", arxiv:1206.4646
- Xiaohui Wang, J. S. Marron, "A scale-based approach to finding effective dimensionality in manifold learning", Electronic Journal of Statistics 2 (2008): 127--148, arxiv:0710.5349
- Zhenyue Zhang and Jing Wang, "MLLE: Modified Locally Linear Embedding Using Multiple Weights", NIPS 2006