Manifold Approximation by Moving Least-Squares Projection (MMLS)
Constructive Approximation (IF 2.3) Pub Date: 2019-12-16, DOI: 10.1007/s00365-019-09489-8
Barak Sober , David Levin

In order to avoid the curse of dimensionality frequently encountered in big data analysis, there has been vast development in recent years in the field of linear and nonlinear dimension reduction techniques. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lie on a lower-dimensional manifold; thus the high-dimensionality problem can be overcome by learning the lower-dimensional behavior. However, in real-life applications, data are often very noisy. In this work, we propose a method to approximate \(\mathcal{M}\), a d-dimensional \(C^{m+1}\) smooth submanifold of \(\mathbb{R}^n\) (\(d \ll n\)), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower-dimensional manifold and suggest a nonlinear moving least-squares projection onto an approximating d-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., \(\mathcal{O}(h^{m+1})\), where h is the fill distance and m is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension n. Furthermore, the approximating manifold can serve as a framework for performing operations directly on the high-dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions in the data, can be avoided altogether.
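The projection described in the abstract proceeds in two local steps: a weighted fit of a d-dimensional reference plane around the query point, followed by a weighted polynomial fit over that plane. The sketch below is an illustrative first-order (m = 1) simplification, not the paper's full iterative procedure; the function name `mmls_project`, the Gaussian weight function, and the single-pass PCA-based plane estimate are all assumptions made for brevity.

```python
import numpy as np

def mmls_project(r, X, d, h):
    """Sketch of one MMLS-style projection step (first-order local fit).

    r : (n,) noisy query point assumed to lie near the manifold
    X : (N, n) scattered data cloud sampled near a d-dim manifold
    d : intrinsic dimension; h : locality (weight) scale
    """
    # Gaussian locality weights centered at the query point
    w = np.exp(-np.sum((X - r) ** 2, axis=1) / h ** 2)
    w /= w.sum()

    # Weighted local origin and weighted PCA for a d-dim reference plane
    q = w @ X
    C = (X - q).T @ ((X - q) * w[:, None])   # weighted covariance (n, n)
    _, eigvecs = np.linalg.eigh(C)
    B = eigvecs[:, -d:]                      # top-d directions, shape (n, d)

    # Local coordinates of the data over the reference plane
    U = (X - q) @ B                          # (N, d)
    u_r = (r - q) @ B                        # (d,)

    # Weighted degree-1 fit g: R^d -> R^n, evaluated at the query's coords
    # (the paper uses degree-m polynomials; linear terms suffice here)
    A = np.hstack([np.ones((len(U), 1)), U])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * A, sw * (X - q), rcond=None)
    return q + np.concatenate([[1.0], u_r]) @ coef
```

For example, projecting a point lying slightly off a unit circle embedded in \(\mathbb{R}^3\) (d = 1) returns a point close to the circle; note that the cost of each step is linear in the ambient dimension n, matching the complexity claim in the abstract.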


