On clustering uncertain and structured data with Wasserstein barycenters and a geodesic criterion for the number of clusters
Journal of Statistical Computation and Simulation (IF 1.1), Pub Date: 2021-03-30, DOI: 10.1080/00949655.2021.1903463
G. I. Papayiannis 1, 2 , G. N. Domazakis 1, 3 , D. Drivaliaris 4 , S. Koukoulas 5 , A. E. Tsekrekos 6 , A. N. Yannacopoulos 1

Clustering schemes for uncertain and structured data are considered that rely on the notion of Wasserstein barycenters, accompanied by appropriate clustering indices based on the intrinsic geometry of the Wasserstein space. Such clustering approaches are valued in many fields where the observational or experimental error is significant, or where the data are more complex in nature and traditional learning algorithms are not applicable or not effective. From this perspective, each observation is identified with an appropriate probability measure, and the proposed clustering schemes rely on discrimination criteria that exploit the geometric structure of the space of probability measures through core techniques from optimal transport theory. The advantages and capabilities of the proposed approach, and the performance of the geodesic criterion, are illustrated through a simulation study and two applications: (a) clustering eurozone countries' bond yield curves and (b) classifying satellite images into land-use categories.
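
As a rough illustration of the general idea in the abstract, and not the authors' implementation, the sketch below runs a k-means-style loop in which each observation is a univariate Gaussian and each cluster centre is a Wasserstein barycenter. It assumes the well-known closed forms for one-dimensional Gaussians: the 2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2) is sqrt((m1 - m2)^2 + (s1 - s2)^2), and the equal-weight barycenter is the Gaussian whose mean and standard deviation are the averages of the members' means and standard deviations. The function names, the Gaussian assumption, and the fixed number of clusters `k` are illustrative choices; the paper's geodesic criterion for selecting the number of clusters is not reproduced here.

```python
import numpy as np


def w2_gaussian_1d(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2) (closed form)."""
    return np.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)


def barycenter_gaussian_1d(means, stds):
    """Equal-weight W2 barycenter of 1-D Gaussians: average the means and the stds."""
    return float(np.mean(means)), float(np.mean(stds))


def wasserstein_kmeans(means, stds, k, n_iter=50, seed=0):
    """k-means-style clustering of 1-D Gaussians under the W2 distance.

    Hypothetical helper for illustration only; k is supplied by the caller.
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    rng = np.random.default_rng(seed)

    # Initialise the centres with k distinct observations.
    idx = rng.choice(len(means), size=k, replace=False)
    c_means, c_stds = means[idx].copy(), stds[idx].copy()
    labels = np.full(len(means), -1)

    for _ in range(n_iter):
        # (n, k) matrix of W2 distances from each observation to each centre.
        dist = w2_gaussian_1d(
            means[:, None], stds[:, None], c_means[None, :], c_stds[None, :]
        )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels

        # Replace each centre by the barycenter of its current members.
        for j in range(k):
            members = labels == j
            if members.any():
                c_means[j], c_stds[j] = barycenter_gaussian_1d(
                    means[members], stds[members]
                )

    return labels, c_means, c_stds


if __name__ == "__main__":
    # Synthetic data: two groups that differ in both location and spread.
    obs_means = [0.0, 0.1, -0.1, 10.0, 10.2, 9.9]
    obs_stds = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
    labels, c_means, c_stds = wasserstein_kmeans(obs_means, obs_stds, k=2)
    print(labels, c_means, c_stds)
```

For general (non-Gaussian or multivariate) measures neither the distance nor the barycenter has such a simple form, so a numerical optimal transport solver would be needed in place of the two closed-form helpers.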




Updated: 2021-03-30