Manifold Learning
27 Feb 2017 16:30
Suppose that we observe random vectors $X$ in some $p$-dimensional space, most often ordinary Euclidean space $\mathbb{R}^p$. Sometimes the probability distribution of $X$ is supported on a large volume in $\mathbb{R}^p$, maybe even the whole space; this is the classical situation considered by multivariate statistics. Sometimes, however, the support is concentrated on, or at least near, some geometric structure of dimension $q < p$. If the low-dimensional structure is a finite set of 0-dimensional points, we have clustering. If it is a linear subspace, we have the situation dealt with by factor analysis. More interesting to me is the more general situation where the low-dimensional structure is a smooth but curved manifold. The goal of manifold learning is then to reconstruct the underlying manifold from observations of $X$. We might also be content with knowing some geometric or topological properties of the manifold, such as its dimension.
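For concreteness, here is a minimal sketch (mine, not part of the references below) of the reconstruction problem, using scikit-learn's implementation of Roweis and Saul's locally linear embedding on a synthetic "swiss roll" — a 2-dimensional manifold curled up in $\mathbb{R}^3$:

```python
# A hedged illustration: recover a low-dimensional representation of data
# sampled from a 2-d manifold embedded in R^3, via locally linear embedding.
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# X has shape (1000, 3): noisy samples from the swiss-roll manifold;
# t is the underlying 1-d "unrolling" coordinate of each point.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Ask for a 2-d embedding, since the roll is intrinsically 2-dimensional.
# n_neighbors controls the size of the local linear patches.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0)
Y = lle.fit_transform(X)

print(Y.shape)  # one 2-d coordinate per original 3-d observation
```

The choice of the intrinsic dimension (here, 2) is itself an estimation problem; see the Levina and Bickel paper below.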
See also: Factor Models; Machine Learning, Statistical Inference, and Induction; State-Space Reconstruction; Statistics; Statistics on Manifolds
- Recommended:
- Mikhail Belkin and Partha Niyogi, "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation", Neural Computation 15 (2003): 1373--1396
- Ann B. Lee and Larry Wasserman, "Spectral Connectivity Analysis", Journal of the American Statistical Association 105 (2010): 1241--1255, arxiv:0811.0121
- Elizaveta Levina and Peter J. Bickel, "Maximum Likelihood Estimation of Intrinsic Dimension", NIPS 2004
- Sam T. Roweis and Lawrence K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding", Science 290 (2000): 2323--2326
- Lawrence K. Saul and Sam T. Roweis, "Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds", Journal of Machine Learning Research 4 (2003): 119--155
- Modesty forbids me to recommend:
- My lectures 10--15 in my data mining class
- To read:
- Mridul Aanjaneya, Frédéric Chazal, Daniel Chen, Marc Glisse, Leonidas J. Guibas, Dmitriy Morozov, "Metric graph reconstruction from noisy data", inria-00630774
- Evgeni Begelfor and Michael Werman, "The World is not Always Flat, or, Learning Curved Manifolds" [PDF]
- Alexander V. Bernstein, Alexander P. Kuleshov, "Tangent Bundle Manifold Learning via Grassmann&Stiefel Eigenmaps", arxiv:1212.6031
- Christopher J. C. Burges, "Dimension Reduction: A Guided Tour", Foundations and Trends in Machine Learning 2:4 (2010) [Preprint version]
- Kevin M. Carter, Raviv Raich, William G. Finn, Alfred O. Hero, "FINE: Fisher Information Non-parametric Embedding", arxiv:0802.2050
- Claudio Ceruti, Simone Bassis, Alessandro Rozza, Gabriele Lombardi, Elena Casiraghi, Paola Campadelli, "DANCo: Dimensionality from Angle and Norm Concentration", arxiv:1206.3881
- Dong Chen, Hans-Georg Müller, "Nonlinear manifold representations for functional data", Annals of Statistics 40 (2012): 1--29, arxiv:1205.6040
- Lisha Chen, Andreas Buja, "Stress Functions for Nonlinear Dimension Reduction, Proximity Analysis, and Graph Drawing", Journal of Machine Learning Research 14 (2013): 1145--1173
- Ming-yen Cheng and Hau-tieng Wu, "Local Linear Regression on Manifolds and Its Geometric Interpretation", Journal of the American Statistical Association 108 (2013): 1421--1434, arxiv:1201.0327
- Andreas Damianou, Carl Ek, Michalis Titsias, Neil Lawrence, "Manifold Relevance Determination", ICML 2012, arxiv:1206.4610
- Charles Fefferman, Sanjoy Mitter, Hariharan Narayanan, "Testing the Manifold Hypothesis", arxiv:1310.0425
- Christopher Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman
- "Minimax Manifold Estimation", Journal of Machine Learning Research 13 (2012): 1263--1291, arxiv:1007.0549
- "Manifold estimation and singular deconvolution under Hausdorff loss", Annals of Statistics 40 (2012): 941--963, arxiv:1109.4540
- Samuel Gerber and Ross Whitaker, "Regularization-Free Principal Curve Estimation", Journal of Machine Learning Research 14 (2013): 1285--1302
- Evarist Giné and Vladimir Koltchinskii, "Empirical graph Laplacian approximation of Laplace--Beltrami operators: Large sample results", in High Dimensional Probability: Proceedings of the Fourth International Conference (Giné, Koltchinskii, Li and Zinn, eds.), arxiv:math/0612777
- Yair Goldberg, Alon Zakai, Dan Kushnir, Ya'acov Ritov, "Manifold Learning: The Price of Normalization", Journal of Machine Learning Research 9 (2008): 1909--1939
- Dian Gong, Xuemei Zhao, Gerard Medioni, "Robust Multiple Manifolds Structure Learning", arxiv:1206.4624
- John C. Gower and Jörg Blasius, "Multivariate Prediction with Nonlinear Principal Components Analysis"
- "Theory", Quality and Quantity 39 (2005): 359--372
- "Application", Quality and Quantity 39 (2005): 373--390
- Reinhard Heckel, Helmut Bölcskei, "Robust Subspace Clustering via Thresholding", arxiv:1307.4891
- Alan Julian Izenman, Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning
- Daniel N. Kaslovsky, Francois G. Meyer, "Non-Asymptotic Analysis of Tangent Space Perturbation", arxiv:1111.4601
- Brian Kulis, "Metric Learning: A Survey", Foundations and Trends in Machine Learning 5 (2013): 287--364
- Neil D. Lawrence, "A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models", Journal of Machine Learning Research 13 (2012): 1609--1638
- Sayan Mukherjee, Qiang Wu, Ding-Xuan Zhou, "Learning gradients on manifolds", Bernoulli 16 (2010): 181--207, arxiv:1002.4283
- Partha Niyogi, "Manifold Regularization and Semi-supervised Learning: Some Theoretical Analyses", Journal of Machine Learning Research 14 (2013): 1229--1250
- Partha Niyogi, Stephen Smale, and Shmuel Weinberger, "Finding the Homology of Submanifolds with High Confidence from Random Samples", Discrete and Computational Geometry OF1--OF23 (2006) [PDF reprint]
- Dominique Perrault-Joncas, Marina Meila, "Non-linear dimensionality reduction: Riemannian metric estimation and the problem of geometric discovery", arxiv:1305.7255
- Salah Rifai, Yoshua Bengio, Yann Dauphin, Pascal Vincent, "A Generative Process for Sampling Contractive Auto-Encoders", ICML 2012, arxiv:1206.6434
- A. Rozza, G. Lombardi, C. Ceruti, E. Casiraghi, P. Campadelli, "Novel high intrinsic dimensionality estimators", Machine Learning 89 (2012): 37--65
- Patrick T. Sadtler, Kristin M. Quick, Matthew D. Golub, Steven M. Chase, Stephen I. Ryu, Elizabeth C. Tyler-Kabara, Byron M. Yu and Aaron P. Batista, "Neural constraints on learning", Nature 512 (2014): 423--426
- Amit Singer, Hau-tieng Wu, "Spectral Convergence of the Connection Laplacian from random samples", arxiv:1306.1587
- Hiromichi Suetani, Karin Soejima, Rei Matsuoka, Ulrich Parlitz, and Hiroki Hata, "Manifold learning approach for chaos in the dripping faucet", Physical Review E 86 (2012): 036209
- Ronen Talmon and Ronald R. Coifman, "Empirical intrinsic geometry for nonlinear modeling and time series filtering", Proceedings of the National Academy of Sciences (USA) 110 (2013): 12535--12540
- Laurens van der Maaten, "Barnes-Hut-SNE", arxiv:1301.3342
- Laurens van der Maaten and Geoffrey Hinton, "Visualizing Data using t-SNE", Journal of Machine Learning Research 9 (2008): 2579--2605 [SNE = "stochastic neighbor embedding", a manifold-learning technique]
- Max Vladymyrov and Miguel Carreira-Perpinan, "Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings", arxiv:1206.4646
- Xiaohui Wang, J. S. Marron, "A scale-based approach to finding effective dimensionality in manifold learning", Electronic Journal of Statistics 2 (2008): 127--148, arxiv:0710.5349
- Zhenyue Zhang and Jing Wang, "MLLE: Modified Locally Linear Embedding Using Multiple Weights", NIPS 2006