Abstract
A growing number of problems in data analysis and classification involve data that are non-Euclidean. For such problems, a naive application of vector space analysis algorithms will produce results that depend on the choice of local coordinates used to parametrize the data. At the same time, many data analysis and classification problems eventually reduce to an optimization, in which the criteria being minimized can be interpreted as the distortion associated with a mapping between two curved spaces. Exploiting this distortion minimizing perspective, we first show that manifold learning problems involving non-Euclidean data can be naturally framed as seeking a mapping between two Riemannian manifolds that is closest to being an isometry. A family of coordinate-invariant first-order distortion measures is then proposed to measure the proximity of the mapping to an isometry; these measures are applied to manifold learning for non-Euclidean data sets. Case studies ranging from synthetic data to human mass-shape data demonstrate the many performance advantages of our Riemannian distortion minimization framework.
Notes
A useful analogy is the problem of making two-dimensional Cartesian maps of the earth: given a set of data points sampled from the earth’s surface, a two-dimensional surface—in this case a sphere—is first fitted to these points, and a Cartesian map of the sphere that best preserves distances and angles is then sought.
Recall that the spectral norm of a square matrix A is the positive square root of the maximum eigenvalue of \(A^{\top } A\). It can also be verified that if \(\lambda _i\) is an eigenvalue of \(J^{\top } H J G^{-1}\), then \(\lambda _i-1\) is an eigenvalue of \(J^{\top } H J G^{-1}-I\).
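Both facts are easy to check numerically; the following sketch uses random matrices (not tied to any quantity in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Spectral norm: positive square root of the largest eigenvalue of A^T A.
spec_from_eig = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))
assert np.isclose(spec_from_eig, np.linalg.norm(A, 2))

# Eigenvalue shift: if lam is an eigenvalue of B, then lam - 1 is an
# eigenvalue of B - I (B and B - I share eigenvectors).
B = rng.standard_normal((4, 4))
lams = np.sort(np.linalg.eigvals(B))
lams_shifted = np.sort(np.linalg.eigvals(B - np.eye(4)))
assert np.allclose(lams, lams_shifted + 1)
```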
The kernel function defined on Riemannian manifolds as in (11) is known not to be positive-definite in general (Jayasumana et al. 2015; Feragen et al. 2015). However, since our manifold learning purposes mainly require capturing the submanifold on which the data points lie, we do not require positive-definiteness of the kernel.
Such a choice for weights is based on the approximation \({\tilde{d}}_i \approx c'\frac{\sqrt{\det G}}{\rho }(x_i)\) for a constant \(c'>0\), where \(\rho : \mathcal {M} \rightarrow {\mathbb {R}}\) is the underlying probability density generating data \(x_i\), satisfying \(\rho (x)\ge 0\) for all \(x \in {\mathbb {R}}^m\) and \(\int _\mathcal {M} \rho (x) \ dx = 1\). We refer the reader to equation (A.1.27) in Appendix A.1 of Jang (2019) for this approximation.
References
Barahona S, Gual-Arnau X, Ibáñez MV, Simó A (2018) Unsupervised classification of children’s bodies using currents. Adv Data Anal Classif 12(2):365–397
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396
Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434
Boothby WM (1986) An introduction to differentiable manifolds and Riemannian geometry, vol 120. Academic Press, Cambridge
Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond Euclidean data. IEEE Signal Process Mag 34(4):18–42
Coifman RR, Lafon S (2006) Diffusion maps. Appl Comput Harmonic Anal 21(1):5–30
Desbrun M, Meyer M, Alliez P (2002) Intrinsic parameterizations of surface meshes. Comput Graph Forum 21:209–218
Donoho DL, Grimes C (2003) Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci 100(10):5591–5596
Dubrovin BA, Fomenko AT, Novikov SP (1992) Modern geometry-methods and applications Part I. The geometry of surfaces, transformation groups, and fields. Springer, Berlin
Eells J, Lemaire L (1978) A report on harmonic maps. Bull London Math Soc 10(1):1–68
Eells J, Lemaire L (1988) Another report on harmonic maps. Bull London Math Soc 20(5):385–524
Eells J, Sampson JH (1964) Harmonic mappings of Riemannian manifolds. Am J Math 86(1):109–160
Feragen A, Lauze F, Hauberg S (2015) Geodesic exponential kernels: when curvature and linearity conflict. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3032–3042
Fletcher PT, Joshi S (2007) Riemannian geometry for the statistical analysis of diffusion tensor data. Signal Process 87(2):250–262
Goldberg Y, Zakai A, Kushnir D, Ritov Y (2008) Manifold learning: the price of normalization. J Mach Learn Res 9:1909–1939
Gu X, Wang Y, Chan TF, Thompson PM, Yau ST (2004) Genus zero surface conformal mapping and its application to brain surface mapping. IEEE Trans Med Imag 23(8):949–958
Jang C (2019) Riemannian distortion measures for non-Euclidean data. Ph.D. thesis, Seoul National University
Jayasumana S, Hartley R, Salzmann M, Li H, Harandi M (2015) Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Trans Pattern Anal Mach Intell 37(12):2464–2477
Lafon SS (2004) Diffusion maps and geometric harmonics. Ph.D. thesis, Yale University
Lee T, Park FC (2018) A geometric algorithm for robust multibody inertial parameter identification. IEEE Robot Autom Lett 3(3):2455–2462
Lin B, He X, Ye J (2015) A geometric viewpoint of manifold learning. Appl Inform 2:3. https://doi.org/10.1186/s40535-015-0006-6
McQueen J, Meila M, Perrault-Joncas D (2016) Nearly isometric embedding by relaxation. In: Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R (eds) Advances in Neural Information Processing Systems, pp 2631–2639
Mullen P, Tong Y, Alliez P, Desbrun M (2008) Spectral conformal parameterization. Comput Graph Forum 27:1487–1494
Park FC, Brockett RW (1994) Kinematic dexterity of robotic mechanisms. Int J Robot Res 13(1):1–15
Pelletier B (2005) Kernel density estimation on Riemannian manifolds. Stat Probab Lett 73(3):297–304
Perrault-Joncas D, Meila M (2013) Non-linear dimensionality reduction: Riemannian metric estimation and the problem of geometric discovery. arXiv preprint arXiv:1305.7255
Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Steinke F, Hein M, Schölkopf B (2010) Nonparametric regression between general Riemannian manifolds. SIAM J Imaging Sci 3(3):527–563
Tenenbaum JB, De Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323
Vinué G, Simó A, Alemany S (2016) The \(k\)-means algorithm for 3d shapes with an application to apparel design. Adv Data Anal Classif 10(1):103–132
Wensing PM, Kim S, Slotine JJE (2018) Linear matrix inequalities for physically consistent inertial parameter identification: a statistical perspective on the mass distribution. IEEE Robot Autom Lett 3(1):60–67
Yang Y, Yu Y, Zhou Y, Du S, Davis J, Yang R (2014) Semantic parametric reshaping of human body models. In: Proceedings of the 2nd International Conference on 3D Vision (3DV), IEEE, vol 2, pp 41–48
Zhang T, Li X, Tao D, Yang J (2008) Local coordinates alignment (LCA): a novel manifold learning approach. Int J Pattern Recogn Artif Intell 22(04):667–690
Zhang Z, Zha H (2004) Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J Sci Comput 26(1):313–338
Cheongjae Jang and Frank Chongwoo Park were supported in part by the NAVER LABS’ AMBIDEX Project, MSIT-IITP (2019-0-01367, BabyMind), SNU-IAMD, SNU BK21+ Program in Mechanical Engineering, SNU Institute for Engineering Research, the National Research Foundation of Korea (NRF-2016R1A5A1938472), the Technology Innovation Program (ATC+, 20008547) funded by the Ministry of Trade, Industry, and Energy (MOTIE, Korea), and SNU BMRR Grant DAPAUD190018ID. Yung-Kyun Noh was supported by Samsung Research Funding & Incubation Center of Samsung Electronics under Project Number SRFC-IT1901-13 and by Hanyang University (HY-2019). (Corresponding author: Frank Chongwoo Park.)
Appendices
A: Further mathematical details of manifold learning algorithms
A.1 Proof of Proposition 1
Proof
The inverse metric \(JG^{-1}J^{\top }\) at \(x = x_i\) is obtained in (14) as
where \(Y = \begin{bmatrix} y_1, \ldots , y_N \end{bmatrix} \in {\mathbb {R}}^{n\times N}\) is the matrix representation of the embeddings, \(L = \frac{1}{ch}( {\tilde{D}}^{-1}{\tilde{K}} - I) \in {\mathbb {R}}^{N\times N}\) is the normalized graph Laplacian (\({\tilde{D}}, {\tilde{K}}\in {\mathbb {R}}^{N\times N}\) are obtained from Algorithm 1 and both \(c,h>0\)), \(L_i \in {\mathbb {R}}^N\) is the i-th row of L, and \(e_i = (0,\ldots ,1,\ldots ,0) \in {\mathbb {R}}^N\) is a standard basis vector whose i-th component is one.
To verify that \(J G^{-1} J^{\top }(x_i)\) is positive semi-definite, it suffices to show that
is positive semi-definite. For any \(v = (v_1, \ldots , v_N) \in {\mathbb {R}}^N\),
where \(L_{ij}\) denotes the (i, j) entry of L in (22). In deriving (23)-(24), we use the equalities \(L_{ij} = \frac{1}{ch}(({\tilde{D}}_{ii})^{-1} {\tilde{K}}_{ij} - \delta _{ij})\) (\(\delta _{ij} = 1\) if \(i=j\) and 0 otherwise) and \(\sum _{j=1}^N ({\tilde{D}}_{ii})^{-1} {\tilde{K}}_{ij} = 1\), together with the inequality \({\tilde{K}}_{ij} \ge 0\) for \(i,j = 1,\ldots ,N\). Since \(v^\top M_i v \ge 0\) for all \(v \in {\mathbb {R}}^N\), \(M_i\) is positive semi-definite; hence \(JG^{-1} J^\top (x_i) = \frac{1}{2ch} Y M_i Y^\top \) is also positive semi-definite for each \(i = 1, \ldots , N\). \(\square \)
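The claim can also be checked numerically. The sketch below assumes the inverse-metric estimate at \(x_i\) takes the form \(\frac{1}{2}\sum_j L_{ij}(y_j - y_i)(y_j - y_i)^{\top}\) (the estimator of Perrault-Joncas and Meila 2013, which the computation above mirrors); the Gaussian kernel and constants are illustrative stand-ins for Algorithm 1:

```python
import numpy as np

rng = np.random.default_rng(1)
N, m, n, c, h = 30, 2, 3, 1.0, 0.2

X = rng.standard_normal((N, m))   # data in local coordinates
Y = rng.standard_normal((n, N))   # an arbitrary candidate embedding

# Kernel and normalized graph Laplacian L = (1/(ch)) (D^{-1} K - I),
# schematically following Algorithm 1 (the Gaussian form is assumed).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / h)
D = K.sum(1)
L = ((K / D[:, None]) - np.eye(N)) / (c * h)

for i in range(N):
    # Inverse-metric estimate at x_i: 0.5 * sum_j L_ij (y_j - y_i)(y_j - y_i)^T.
    diff = Y - Y[:, [i]]
    Hinv = 0.5 * (L[i] * diff) @ diff.T
    # Positive semi-definiteness, as claimed in Proposition 1.
    assert np.linalg.eigvalsh(Hinv).min() > -1e-10
```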
A.2 Riemannian relaxation
In the Riemannian relaxation method of McQueen et al. (2016), \({{\mathcal {M}}}\) is chosen to be an m-dimensional submanifold of Euclidean ambient space \({\mathbb {R}}^D\), with Riemannian metric G corresponding to the Euclidean metric on \({\mathbb {R}}^D\) projected to \({{\mathcal {M}}}\). The target manifold \({{\mathcal {N}}}\) is set to be \({\mathbb {R}}^n\) for some a priori chosen dimension \(n \ge \text{ dim }({{\mathcal {M}}})\); the Riemannian metric on \({{\mathcal {N}}}\) is set to \(H=I\).
Given Euclidean data points \(u_i \in {\mathbb {R}}^D\), \(i = 1, \ldots , N\) (\(x_i \in {\mathbb {R}}^m\) in local coordinates), denote their n-dimensional embeddings by \(y_i \in {\mathbb {R}}^n\). The embedding is then obtained as the solution to the following optimization:
where \(JG^{-1}J^{\top }(u_i)\) denotes the \(JG^{-1}J^{\top }\) estimated at \(u_i\) using the method presented in Perrault-Joncas and Meila (2013), \(\Vert \cdot \Vert \) denotes the matrix spectral norm, and \(\alpha _i\) are weights. If \(n > \text {dim}(\mathcal {M})\), I in (25) is replaced by \(R_m R_m^{\top }\), where \(R_m = [r_1, \ldots , r_m] \in {\mathbb {R}}^{n\times m}\) with \(r_i \in {\mathbb {R}}^n\) the i-th singular vector of \(JG^{-1}J^{\top }\).
From the perspective of our Riemannian distortion framework, assuming the rank of \(J G^{-1} J^{\top }\) is m and the weights \(\alpha _i\) in (25) are set to \({\tilde{d}}_{i}\) (obtained from Algorithm 1), the objective function in (25) can be expressed as
where the \(\lambda _i\) are the m nonzero eigenvalues of \(J G^{-1} J^{\top }\), which are identical to those of \(J^{\top } J G^{-1}\). Since in practice the numerical estimation of \(J G^{-1} J^{\top } \in {\mathbb {R}}^{n\times n}\) may yield a rank higher than m when \(n > \text{ dim }({{\mathcal {M}}})\), one solution is to impose a soft constraint on the rank of \(J G^{-1} J^{\top }\), e.g., in McQueen et al. (2016) the optimization is formulated as
where \(\lambda _i\) are the eigenvalues of \(J G^{-1} J^{\top }\), \(\mathcal {I}_m\) denotes the set of indices of the m largest eigenvalues, and \(\epsilon >0\) is a scalar parameter intended to suppress the smaller \((n-m)\) eigenvalues.
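The structure of this objective can be sketched as follows. The exact aggregation and rank-suppression scheme of McQueen et al. (2016) may differ, so treat this as an illustration of a weighted spectral-norm loss with suppression of the trailing eigenvalues, not their implementation:

```python
import numpy as np

def relaxation_loss(Hinv_list, alphas, eps=1e-3, m=None):
    """Schematic Riemannian-relaxation objective: a weighted sum of spectral
    norms ||J G^{-1} J^T - I||, where the smaller (n - m) eigenvalues of each
    estimate are pushed toward zero when the target dimension n exceeds m.
    A sketch of the objective's structure, not McQueen et al.'s code."""
    total = 0.0
    for Hinv, a in zip(Hinv_list, alphas):
        lams = np.linalg.eigvalsh(Hinv)[::-1]        # eigenvalues, descending
        if m is None or m == len(lams):
            total += a * np.abs(lams - 1.0).max()    # spectral norm of Hinv - I
        else:
            top = np.abs(lams[:m] - 1.0).max()       # leading m: match isometry
            rest = eps * np.abs(lams[m:]).max()      # trailing n-m: suppress
            total += a * max(top, rest)
    return total
```

An exact isometry (all estimates equal to the identity) gives zero loss, as expected.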
A.3 Proof of Proposition 2
Proof
For \(H = I\), the discretized formulation of the harmonic mapping distortion in the form of (15) is obtained as follows:
where \({\tilde{K}}, {\tilde{d}}_i, {\tilde{D}}\) are obtained from Algorithm 1, \(L = \frac{1}{ch} ({\tilde{D}}^{-1}{\tilde{K}} - I)\in {\mathbb {R}}^{N \times N}\) is the graph Laplacian from Algorithm 1, \(L_i\) is the i-th row of L, \(e_i = (0,\ldots ,1,\ldots ,0) \in {\mathbb {R}}^N\) is a standard basis vector whose i-th component is one, and \(\mathbb {1}_N \in {\mathbb {R}}^N\) denotes an N-dimensional vector whose components are all one. In deriving (28)–(31), we use the estimate of \(JG^{-1}J^{\top }\) at \(x_i\) in (14) and the equalities \(\text {Tr}(J^{\top } HJG^{-1}) = \text {Tr}(JG^{-1}J^{\top }H)\) and \(\mathbb {1}_N^{\top } {\tilde{D}} L = 0\).
Given a constant matrix \(Y_b\) specified by the boundary condition, minimizing (31) for \(Y_r\) reduces to
A closed-form solution for \(Y_r\) is obtained as
where \(W = {\tilde{K}}_{br} ({\tilde{D}}_{rr}-{\tilde{K}}_{rr})^{-1} \in {\mathbb {R}}^{N_b \times N_r}\).
Assume that \({\tilde{K}}_{ij} = {\tilde{K}}_{ji} \ge 0\) for all \(i, j = 1, \ldots , N\), that the graph with \({\tilde{K}}_{rr}\) as its adjacency matrix is connected, and that \({\tilde{K}}_{br}\) is not a zero matrix. Then the matrix \(({\tilde{D}}_{rr}-{\tilde{K}}_{rr})\) is positive-definite, so that W is well-defined. The positive-definiteness of \(({\tilde{D}}_{rr}-{\tilde{K}}_{rr})\) can be shown from the following inequality: for any \(v = (v_1, \ldots , v_{N_r})\ne 0 \in {\mathbb {R}}^{N_r}\),
where we use the fact that \(({\tilde{D}}_{rr})_{ii} = \sum _{k=1}^{N_b} ({\tilde{K}}_{br})_{ki} + \sum _{k = 1}^{N_r} ({\tilde{K}}_{rr})_{ki}\) in deriving (36). From the direct application of Cramer’s rule, it can be shown that each entry of \(({\tilde{D}}_{rr} - {\tilde{K}}_{rr})^{-1}\) is non-negative. Since every entry of \({\tilde{K}}_{br}\) is non-negative, all the entries of W are also non-negative. Furthermore, W satisfies the equation \(\mathbb {1}_{N_r}^{\top } = \mathbb {1}_{N_b}^{\top } W\) from the equality \({\tilde{D}}_{rr} \mathbb {1}_{N_r} = {\tilde{K}}_{rr} \mathbb {1}_{N_r} + {\tilde{K}}_{br}^{\top } \mathbb {1}_{N_b}\); hence the entries of each column of W sum to one. \(\square \)
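The two properties of W established above, entrywise non-negativity and unit column sums, can be checked numerically with a random symmetric kernel satisfying the proof's assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
N, Nb = 12, 4                      # total points, boundary points
Nr = N - Nb

# Symmetric non-negative kernel over all points (the setting assumed above).
A = rng.random((N, N))
K = (A + A.T) / 2
np.fill_diagonal(K, 0.0)

K_rr = K[Nb:, Nb:]                 # interior-interior block
K_br = K[:Nb, Nb:]                 # boundary-interior block
# Degrees of interior points: (D_rr)_ii = sum_k (K_br)_ki + sum_k (K_rr)_ki.
D_rr = np.diag(K_br.sum(0) + K_rr.sum(0))

W = K_br @ np.linalg.inv(D_rr - K_rr)

assert (W >= -1e-12).all()                    # all entries are non-negative
assert np.allclose(W.sum(0), np.ones(Nr))     # each column of W sums to one
```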
B: Experimental details for Section 4
B.1 Swiss roll
Here we explain further experimental details for the case study performed in Sect. 4.1. The data points are non-uniformly sampled; referring to the unfolded manifold in Fig. 3a, the density is set to oscillate along the horizontal axis, while uniform along the vertical axis. When choosing the initial parameter value \(\theta _0\) for Algorithm 2, locality-preserving embeddings are preferable. As such an initial parameter value, we use the two-dimensional embedding obtained from Isomap (Tenenbaum et al. 2000). Any other embedding that preserves locality can also be used as an initial guess, e.g., those from locally linear embedding (LLE; Roweis and Saul 2000), Laplacian eigenmap (LE; Belkin and Niyogi 2003), diffusion map (DM; Coifman and Lafon 2006), Hessian eigenmap (HLLE; Donoho and Grimes 2003), or local tangent space alignment (LTSA; Zhang and Zha 2004).
For the embedding obtained from the Isomap method, we test five different scalings of it as the initial parameter value \(\theta _0\) for Algorithm 3; we then choose the output embeddings that best match the pairwise distances between the ten nearest neighbors in the ambient space. Also note that the kernel bandwidth parameter h for the approximation of the graph Laplacian in Algorithm 2 is chosen to be of the same order as the average nearest-neighbor distance of the data points, following Lafon (2004).
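A minimal sketch of the non-uniform sampling scheme via rejection sampling; the specific oscillation profile, parameter ranges, and roll shape below are illustrative assumptions, not the experiment's exact settings:

```python
import numpy as np

def sample_swiss_roll(N, seed=0):
    """Sample a Swiss roll whose density oscillates along the unrolled
    horizontal axis and is uniform along the vertical axis."""
    rng = np.random.default_rng(seed)
    samples = []
    while len(samples) < N:
        t = rng.uniform(1.5 * np.pi, 4.5 * np.pi)   # horizontal (roll) parameter
        # Rejection step: oscillating acceptance probability in [0.1, 1.0].
        if rng.uniform() > 0.55 + 0.45 * np.sin(4.0 * t):
            continue
        z = rng.uniform(0.0, 10.0)                  # vertical axis: uniform
        samples.append((t * np.cos(t), z, t * np.sin(t)))
    return np.array(samples)

X = sample_swiss_roll(500)
```

An Isomap embedding of `X` (e.g., via scikit-learn's `Isomap`) could then serve as the locality-preserving initial guess \(\theta_0\) described above.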
B.2 Synthetic P(2) data
B.2.1 Details for the submanifold considered in Section 4.2
The tangent space of \(\text {P(n)}\) at any \(P \in \text {P(n)}\) can be identified with \(\text {S(n)}\), the space of \(n \times n\) symmetric matrices. Given \(X, Y \in \text {S(n)}\), the affine-invariant Riemannian metric at P is defined by the inner product
Consider the following orthogonal decomposition of \(P \in \text {P(2)}\):
where \(R = \left[ \begin{array}{rr} \cos \theta &{} \quad -\sin \theta \\ \sin \theta &{} \quad \cos \theta \end{array} \right] \in \text {SO(2)}\) with \(\theta \in [0, \frac{\pi }{2})\), and \(S = \text {diag}(e^{p}, e^{q})\) for scalars p and q. A local coordinate chart can be defined in terms of \((p,q,\theta )\) on an open subset \(U = \{P \in \text {P(2)} \, | \, P \ne c I \ \ \text {for} \ \ c > 0\}\). The affine-invariant Riemannian metric in (37) is then represented in \((p,q,\theta )\)-coordinates (at \(p \ne q\)) as
For the case study in Sect. 4.2, the data set shown in Fig. 4a is generated by joining two cylinders (with a hole) \(\mathcal {C}_1\) and \(\mathcal {C}_2\) in \((p, q, \theta )\)-coordinates, where \(\mathcal {C}_1 = \{(p, q, \theta ) \ | \ p = \sin \theta _S, \; q = -1 + \cos \theta _S, \; \theta _S \in \left[ 0, \frac{4}{3}\pi \right] , \; \theta \in \left[ 0, \frac{\pi }{4} \right] \}\) and \(\mathcal {C}_2 = \{(p, q, \theta ) \ | \ p = \sin \theta _S, \; q = 1 - \cos \theta _S, \; \theta _S \in \left[ -\frac{4}{3}\pi , 0 \right] , \; \theta \in \left[ 0, \frac{\pi }{4} \right] \}\) (see Fig. 4a; the backbone curve in the figure corresponds to the direction along which \(\theta _S\) varies). The affine-invariant Riemannian metric on this submanifold (at \(\theta _S \ne 0\)) is obtained in terms of coordinates \((\theta , \theta _S)\), \(\theta _S \ne 0\), as
Because of the nonzero Riemannian curvature of this submanifold, isometric embeddings in two-dimensional Euclidean space do not exist.
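A short numerical sketch of the affine-invariant metric and its defining invariance property, assuming (37) takes the standard form \(\langle X, Y\rangle_P = \text{Tr}(P^{-1} X P^{-1} Y)\):

```python
import numpy as np

def affine_invariant_inner(P, X, Y):
    """Affine-invariant inner product on P(n): <X, Y>_P = Tr(P^{-1} X P^{-1} Y),
    with X, Y symmetric tangent vectors at the SPD matrix P."""
    Pinv = np.linalg.inv(P)
    return np.trace(Pinv @ X @ Pinv @ Y)

rng = np.random.default_rng(3)
B = rng.standard_normal((2, 2))
P = B @ B.T + 2 * np.eye(2)                     # a point in P(2)
X = rng.standard_normal((2, 2)); X = X + X.T    # tangent vectors in S(2)
Y = rng.standard_normal((2, 2)); Y = Y + Y.T

# Affine invariance: unchanged under P -> A P A^T, X -> A X A^T, Y -> A Y A^T.
A = rng.standard_normal((2, 2)) + 3 * np.eye(2)
lhs = affine_invariant_inner(A @ P @ A.T, A @ X @ A.T, A @ Y @ A.T)
assert np.isclose(lhs, affine_invariant_inner(P, X, Y))
```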
B.2.2 Evaluation of the pairwise distance and tangent vector angle errors
For data points \(x_i \in \text {P(2)}\) and corresponding embeddings \(y_i \in {\mathbb {R}}^2\), \(i = 1, \ldots , N\), the pairwise distance error for k nearest points is defined as
where \(y_i\) denotes the optimal embedding of \(x_i\), \(\text {NN}_{k}(i)\) denotes the set of indices of k nearest neighbor points to \(x_i\), and \(\text {dist}(x_i,x_j)\) denotes the ground truth distance between \(x_i\) and \(x_j\), i.e., the geodesic distance measured on the submanifold. To measure angles, the tangent vectors are approximated by the difference between nearest neighbors. The tangent vector angle error is defined as
where \(v_i\) and \(V_i\) denote the tangent vector from the l-th data point to the i-th data point, measured in the optimal embeddings and in the original data points, respectively, and \(\langle \cdot , \cdot \rangle \) denotes the inner product.
When reporting the final manifold learning results in Table 1, for the reference values to evaluate the pairwise distance error, we numerically obtain the minimal geodesic distances on the submanifolds. Also, the inner product in (37) is used to calculate the reference values for the angles between tangent vectors.
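A plausible implementation of the pairwise distance error; since the exact aggregation formula is not reproduced here, the averaging of relative errors below is an assumption:

```python
import numpy as np

def pairwise_distance_error(Y, geo_dist, k):
    """Average relative error between embedded Euclidean distances and
    ground-truth geodesic distances, over each point's k nearest neighbors.
    Y: (N, n) embeddings; geo_dist: (N, N) ground-truth geodesic distances."""
    N = len(Y)
    errs = []
    for i in range(N):
        nn = np.argsort(geo_dist[i])[1:k + 1]   # k nearest neighbors (skip self)
        for j in nn:
            d_emb = np.linalg.norm(Y[i] - Y[j])
            errs.append(abs(d_emb - geo_dist[i, j]) / geo_dist[i, j])
    return float(np.mean(errs))

# Sanity check: a perfectly isometric "embedding" has zero error.
Y_line = np.array([[0.0], [1.0], [2.0], [3.5]])
G_line = np.abs(Y_line - Y_line.T)
assert pairwise_distance_error(Y_line, G_line, k=2) == 0.0
```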
B.3 Human mass-inertia data
B.3.1 Synthesizing human mass-inertia data
Since mass-inertial parameter data for humans are not readily available, we use human shape data from Yang et al. (2014) to synthesize this data set; specifically, assuming uniform mass density, we integrate the volumes of the human body shapes to construct mass-inertial parameter data for the corresponding \(N_l=10\) links.
B.3.2 Further principal components of human mass-inertia data
As a supplement to Fig. 5 in Sect. 4.3, here we provide the third and fourth principal components of the human mass-inertia data obtained from both principal geodesic analysis (PGA) and vector space principal component analysis (PCA). The variations corresponding to the third and fourth principal components of PGA are shown in Fig. 8a–b. Principal component 3 captures variations in height and torso thickness, and principal component 4 captures variations mainly in height.
In the case of vector space PCA shown in Fig. 8c–d, the variations near the mean are qualitatively similar to those obtained for PGA. However, the positive-definiteness requirement is violated even for data points just 0.5 standard deviations away from the mean; the ellipsoids for those inertial parameters collapse, as indicated by the dashed red ellipses in Fig. 8c–d.
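This failure mode can be reproduced schematically: stepping linearly between SPD matrices can leave the positive-definite cone, while stepping through the matrix logarithm (a simplified stand-in for the geodesic constructions underlying PGA) cannot. The matrices below are synthetic, not the paper's data:

```python
import numpy as np

def sym_log(P):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(P)
    return V @ np.diag(np.log(w)) @ V.T

def sym_exp(S):
    """Matrix exponential of a symmetric matrix (always SPD)."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.exp(w)) @ V.T

# Two SPD "data points": extrapolating linearly past P1 leaves the SPD cone,
# while extrapolating in log-coordinates always returns an SPD matrix.
P0 = np.diag([1.0, 1.0])
P1 = np.diag([0.3, 1.0])
t = 2.0                                   # step beyond the data

linear = P0 + t * (P1 - P0)               # vector space (PCA-style) step
geodesic = sym_exp(sym_log(P0) + t * (sym_log(P1) - sym_log(P0)))

assert np.linalg.eigvalsh(linear).min() <= 0.0     # positive-definiteness lost
assert np.linalg.eigvalsh(geodesic).min() > 0.0    # positive-definiteness kept
```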
Jang, C., Noh, YK. & Park, F.C. A Riemannian geometric framework for manifold learning of non-Euclidean data. Adv Data Anal Classif 15, 673–699 (2021). https://doi.org/10.1007/s11634-020-00426-3