Initializing k-means Clustering by Bootstrap and Data Depth, Journal of Classification



Initializing k-means Clustering by Bootstrap and Data Depth
Journal of Classification (IF 1.8), Pub Date: 2020-07-24, DOI: 10.1007/s00357-020-09372-3
Aurora Torrente, Juan Romo

The k-means algorithm is widely used across research fields because it converges quickly to minima of its cost function; however, it is sensitive to initial conditions and frequently gets stuck in local optima. This paper explores a simple, computationally feasible method that provides k-means with a set of initial seeds for clustering datasets of arbitrary dimension. Our technique consists of two stages: first, we obtain a set of prototypes (cluster centers) in the original data space by applying k-means to bootstrap replications of the data; second, we cluster the space of centers, which has tighter (and thus easier to separate) groups, and search for the deepest point in each assembled cluster using a depth notion. We test this method with simulated and real data, compare it with commonly used k-means initialization algorithms, and show that it is feasible and more efficient than previous proposals in many situations.
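The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only says "a depth notion", so spatial depth is used here as an assumed stand-in, and the function names (`kmeans`, `spatial_depth`, `bootstrap_depth_seeds`), the number of bootstrap replicates `n_boot`, and the empty-cluster handling are all hypothetical choices.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, rng, seeds=None, iters=50):
    """Plain Lloyd iterations; returns (centers, labels).

    Without seeds, k distinct data points are drawn at random, so the data
    must contain at least k distinct points.
    """
    centers = list(seeds) if seeds is not None else rng.sample(sorted(set(points)), k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Assumption: an empty cluster simply keeps its previous center.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def spatial_depth(x, cloud):
    """Spatial depth of x in cloud: 1 minus the norm of the mean unit vector.

    Assumed stand-in for the paper's depth notion, which the abstract
    does not name.
    """
    acc = [0.0] * len(x)
    for p in cloud:
        d = math.sqrt(dist2(x, p))
        if d > 0:
            for i in range(len(x)):
                acc[i] += (x[i] - p[i]) / d
    return 1.0 - math.sqrt(sum(a * a for a in acc)) / len(cloud)


def bootstrap_depth_seeds(points, k, n_boot=20, seed=0):
    """Return k initial seeds for k-means (hypothetical helper)."""
    rng = random.Random(seed)
    # Stage 1: run k-means on bootstrap replicates and pool all centers.
    pooled = []
    for _ in range(n_boot):
        replicate = [rng.choice(points) for _ in points]
        centers, _ = kmeans(replicate, k, rng)
        pooled.extend(centers)
    # Stage 2: cluster the pooled centers, then take the deepest one per group.
    group_centers, labels = kmeans(pooled, k, rng)
    seeds = []
    for j in range(k):
        group = [c for c, lab in zip(pooled, labels) if lab == j]
        if group:
            seeds.append(max(group, key=lambda c: spatial_depth(c, group)))
        else:
            seeds.append(group_centers[j])  # fallback for an empty group
    return seeds
```

The returned seeds would then be passed to a final run on the full data, for example `kmeans(points, k, random.Random(0), seeds=seeds)`. Points are expected as tuples of floats so that they can be deduplicated and compared.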


Updated: 2020-07-24
