Abstract
To avoid the curse of dimensionality, frequently encountered in big data analysis, the field of linear and nonlinear dimension reduction techniques has developed rapidly in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower-dimensional manifold; thus the high dimensionality problem can be overcome by learning the lower-dimensional behavior. However, in real-life applications, data is often very noisy. In this work, we propose a method to approximate \(\mathcal{M}\), a \(d\)-dimensional \(C^{m+1}\) smooth submanifold of \(\mathbb{R}^n\) (\(d \ll n\)), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located “near” the lower-dimensional manifold and suggest a nonlinear moving least-squares projection onto an approximating \(d\)-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., \(\mathcal{O}(h^{m+1})\), where \(h\) is the fill distance and \(m\) is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension \(n\). Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high-dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
References
Aizenbud, Y., Averbuch, A.: Matrix decompositions using sub-gaussian random matrices. arXiv preprint arXiv:1602.03360 (2016)
Aizenbud, Y., Sober, B.: Approximating the span of principal components via iterative least-squares. arXiv preprint arXiv:1907.12159 (2019)
Alexa, M., Behr, J., Cohen-Or, D., Fleishman, S., Levin, D., Silva, C.T.: Computing and rendering point set surfaces. IEEE Trans. Vis. Comput. Graph. 9(1), 3–15 (2003)
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Bellman, R.: Dynamic Programming, 1st edn. Princeton University Press, Princeton, NJ (1957)
Bishop, C.M., Svensén, M., Williams, C.K.I.: GTM: a principled alternative to the self-organizing map. In: Artificial Neural Networks—ICANN 96, pp. 165–170. Springer (1996)
Björck, A., Golub, G.H.: Numerical methods for computing angles between linear subspaces. Math. Comput. 27(123), 579–594 (1973)
Boissonnat, J.-D., Ghosh, A.: Manifold reconstruction using tangential Delaunay complexes. Discrete Comput. Geom. 51(1), 221–267 (2014)
Cheng, S.-W., Dey, T.K., Ramos, E.A.: Manifold reconstruction from point samples. SODA 5, 1018–1027 (2005)
Coifman, R.R., Lafon, S.: Diffusion maps. Appl. Comput. Harmon. Anal. 21(1), 5–30 (2006)
Demartines, P., Hérault, J.: Curvilinear component analysis: a self-organizing neural network for nonlinear mapping of data sets. IEEE Trans. Neural Netw. 8(1), 148–154 (1997)
Donoho, D.L.: High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality. AMS Math Challenges Lecture, pp. 1–32 (2000)
Federer, H.: Curvature measures. Trans. Am. Math. Soc. 93(3), 418–491 (1959)
Freedman, D.: Efficient simplicial reconstructions of manifolds from their samples. IEEE Trans. Pattern Anal. Mach. Intell. 24(10), 1349–1357 (2002)
Gong, D., Sha, F., Medioni, G.: Locally linear denoising on image manifolds. J. Mach. Learn. Res. JMLR 2010, 265 (2010)
Harris, P., Brunsdon, C., Charlton, M.: Geographically weighted principal components analysis. Int. J. Geogr. Inf. Sci. 25(10), 1717–1736 (2011)
Hein, M., Maier, M.: Manifold denoising. In: Advances in Neural Information Processing Systems, pp. 561–568 (2006)
Hinrichsen, D., Pritchard, A.J.: Mathematical Systems Theory I: Modelling, State Space Analysis, Stability and Robustness, vol. 48. Springer, Berlin (2005)
Hughes, G.: On the mean accuracy of statistical pattern recognizers. IEEE Trans. Inf. Theory 14(1), 55–63 (1968)
Jolliffe, I.: Principal Component Analysis. Wiley Online Library, Hoboken (2002)
Kohonen, T.: Self-organized formation of topologically correct feature maps. Biol. Cybern. 43(1), 59–69 (1982)
Kohonen, T.: Self-Organizing Maps, vol. 30. Springer Science & Business Media, Berlin (2001)
Lancaster, P., Salkauskas, K.: Surfaces generated by moving least squares methods. Math. Comput. 37(155), 141–158 (1981)
Lang, S.: Fundamentals of Differential Geometry, vol. 191. Springer Science & Business Media, Berlin (2012)
Lax, P.D.: Linear Algebra and Its Applications. Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts. Wiley, Hoboken (2013)
Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer Science & Business Media, Berlin (2007)
Levin, D.: The approximation power of moving least-squares. Math. Comput. Am. Math. Soc. 67(224), 1517–1531 (1998)
Levin, D.: Mesh-independent surface interpolation. In: Brunnett, G., Hamann, B., Müller, H., Linsen, L. (eds.) Geometric Modeling for Scientific Visualization, pp. 37–49. Springer (2004)
McLain, D.H.: Drawing contours from arbitrary data points. Comput. J. 17(4), 318–324 (1974)
Nash, J.: \(C^1\) isometric imbeddings. Ann. Math. 60, 383–396 (1954)
Nealen, A.: An as-short-as-possible introduction to the least squares, weighted least squares and moving least squares methods for scattered data approximation and interpolation, vol. 130, p. 150 (2004). http://www.nealen.com/projects
Rainer, A.: Perturbation theory for normal operators. Trans. Am. Math. Soc. 365(10), 5545–5577 (2013)
Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
Saul, L.K., Roweis, S.T.: Think globally, fit locally: unsupervised learning of low dimensional manifolds. J. Mach. Learn. Res. 4, 119–155 (2003)
Schölkopf, B., Smola, A., Müller, K.-R.: Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10(5), 1299–1319 (1998)
Sober, B.: Structuring high dimensional data: a moving least-squares projective approach to analyze manifold data. Ph.D. thesis, School of Mathematical Sciences, Tel Aviv University, Tel Aviv (2018)
Sober, B., Aizenbud, Y., Levin, D.: Approximation of functions over manifolds: a moving least-squares approach. arXiv preprint arXiv:1711.00765 (2017)
Stewart, G.W., Sun, J.G.: Matrix Perturbation Theory. Computer Science and Scientific Computing. Academic Press, Boston (1990)
Stewart, G.W.: Matrix Algorithms: Volume II: Eigensystems. SIAM, Philadelphia (2001)
Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
Torgerson, W.S.: Multidimensional scaling: I. Theory and method. Psychometrika 17(4), 401–419 (1952)
Von der Malsburg, C.: Self-organization of orientation sensitive cells in the striate cortex. Kybernetik 14(2), 85–100 (1973)
Weinberger, K.Q., Saul, L.K.: An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In: AAAI, vol. 6, pp. 1683–1686 (2006)
Acknowledgements
The authors wish to thank the referees as well as the journal’s editor for their insightful remarks, which had an impact on the final version of the paper.
Communicated by Wolfgang Dahmen.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A - Geometrically Weighted PCA
We wish to present here, in our language, the concept of geometrically weighted PCA (presented slightly differently in [16]), as this concept plays an important role in some of the lemmas proved in Sect. 4, as well as in the algorithm itself.
Given a set of \(I\) vectors \(x_1,\ldots,x_I\) in \(\mathbb{R}^n\), we look for a Rank(d) projection \(P \in \mathbb{R}^{n \times n}\) that minimizes

$$\sum_{i=1}^{I} \Vert P x_i - x_i \Vert _2^2 .$$

If we denote by \(A\) the matrix whose \(i\)th column is \(x_i\), then this is equivalent to minimizing

$$\Vert P A - A \Vert _F^2 .$$

As the best possible Rank(d) approximation to the matrix \(A\) is the SVD Rank(d) truncation denoted by \(A_d = U_d \Sigma _d V_d^T\), we have:

$$P A = A_d , \quad \text {i.e.,} \quad P = U_d U_d^T .$$

And this projection yields:

$$P x = U_d U_d^T x = \sum _{i=1}^{d} \langle x, u_i \rangle \, u_i ,$$

which is the orthogonal projection of \(x\) onto \(\mathrm{span} \lbrace u_i \rbrace _{i=1}^d\). Here \(u_i\) denotes the \(i\)th column of the matrix \(U\).
Remark 5.1
The projection \(P\) is precisely the projection induced by the PCA algorithm.
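The rank-\(d\) projection above can be sketched numerically. The following is a minimal illustration using NumPy's SVD; the variable names (`A`, `U_d`, `P`) are illustrative and not taken from the paper.

```python
import numpy as np

# Columns of A are the data vectors x_1, ..., x_I in R^n.
rng = np.random.default_rng(0)
n, I, d = 5, 20, 2
A = rng.standard_normal((n, I))

# The best Rank(d) approximation A_d comes from the SVD truncation.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
U_d = U[:, :d]                    # first d left singular vectors u_1, ..., u_d
P = U_d @ U_d.T                   # rank-d orthogonal projection onto span{u_i}

x = rng.standard_normal(n)
Px = P @ x
# The equivalent expansion  sum_i <x, u_i> u_i  gives the same result.
Px_alt = sum((x @ U_d[:, i]) * U_d[:, i] for i in range(d))
assert np.allclose(Px, Px_alt)
assert np.allclose(P @ P, P)      # idempotent, as a projection must be
```

The idempotence check `P @ P == P` and the agreement of the two expansions confirm that `P` is indeed the orthogonal projection onto the span of the first \(d\) left singular vectors.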
1.1 The Weighted Projection
In this case, given a set of \(I\) vectors \(x_1,\ldots,x_I\) in \(\mathbb{R}^n\) and positive weights \(w_1,\ldots,w_I\), we look for a Rank(d) projection \(P \in \mathbb{R}^{n \times n}\) that minimizes

$$\sum_{i=1}^{I} w_i \Vert P x_i - x_i \Vert _2^2 .$$

So if we define the matrix \(\tilde{A}\) such that the \(i\)th column of \(\tilde{A}\) is the vector \(y_i = \sqrt{w_i} x_i\) (noting that \(\Vert P y_i - y_i \Vert _2^2 = w_i \Vert P x_i - x_i \Vert _2^2\)), then we get the projection

$$P = \tilde{U}_d \tilde{U}_d^T ,$$

where \(\tilde{U}_d\) is the matrix containing the first \(d\) principal components of the matrix \(\tilde{A}\).
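The weighted construction reduces to the unweighted one after rescaling the columns. A minimal NumPy sketch (names `X`, `w`, `P_w` are illustrative assumptions, not notation from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, I, d = 5, 20, 2
X = rng.standard_normal((n, I))      # columns are the data vectors x_i
w = rng.uniform(0.1, 1.0, size=I)    # positive weights w_i

A_tilde = X * np.sqrt(w)             # column i becomes y_i = sqrt(w_i) * x_i

U, _, _ = np.linalg.svd(A_tilde, full_matrices=False)
U_d = U[:, :d]                       # first d principal components of A_tilde
P_w = U_d @ U_d.T                    # the weighted rank-d projection

def weighted_objective(P):
    """sum_i w_i * ||P x_i - x_i||^2"""
    R = P @ X - X
    return float(np.sum(w * np.sum(R**2, axis=0)))

# P_w minimizes the weighted objective over rank-d orthogonal projections,
# so it beats (or ties) the projection onto a random d-dimensional subspace.
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
assert weighted_objective(P_w) <= weighted_objective(Q @ Q.T) + 1e-9
assert np.allclose(P_w @ P_w, P_w)
```

The final comparison is the Eckart–Young optimality of the SVD truncation restated for the scaled matrix \(\tilde{A}\): since \(P(\sqrt{w_i}\,x_i) = \sqrt{w_i}\,P x_i\), minimizing \(\Vert P \tilde{A} - \tilde{A} \Vert _F^2\) is the same as minimizing the weighted sum.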
Cite this article
Sober, B., Levin, D. Manifold Approximation by Moving Least-Squares Projection (MMLS). Constr Approx 52, 433–478 (2020). https://doi.org/10.1007/s00365-019-09489-8