Improving Density Peak Clustering by Automatic Peak Selection and Single Linkage Clustering
Symmetry (IF 2.940), Pub Date: 2020-07-14, DOI: 10.3390/sym12071168
Jun-Lin Lin, Jen-Chieh Kuo, Hsing-Wang Chuang

Density peak clustering (DPC) is a density-based clustering method that has attracted much attention in the academic community. DPC first searches for density peaks in the dataset and then assigns each remaining data point to the same cluster as its nearest higher-density point. One problem with DPC is determining the density peaks: a poor selection of peaks can yield poor clustering results. Another problem is its cluster assignment strategy, which often assigns data points to the wrong cluster when they are far from their nearest higher-density points. This study modifies DPC and proposes a new clustering algorithm to resolve both problems. The proposed algorithm uses the radius of the neighborhood to automatically select a set of potential density peaks, that is, points that are far from their nearest higher-density points. Treating these potential density peaks as the density peaks, it then applies DPC to obtain preliminary clustering results. Finally, if necessary, it applies single-linkage clustering to the preliminary clustering results to reduce the number of clusters. The proposed algorithm avoids the cluster assignment problem in DPC because the cluster assignments for the potential density peaks are based on single-linkage clustering rather than on DPC. Our performance study shows that the proposed algorithm outperforms DPC for datasets with irregularly shaped clusters.
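
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simple cutoff-kernel density (the number of neighbors within a radius), selects as peaks the points whose distance to the nearest higher-density point exceeds a threshold, and then merges the preliminary clusters with a naive single-linkage step. The names `dpc_sketch`, `merge_single_linkage`, `radius`, `delta_min`, and `k` are hypothetical and chosen only for illustration; the paper's actual peak-selection rule and stopping criterion are not given in the abstract.

```python
import numpy as np


def dpc_sketch(X, radius, delta_min):
    """Basic density peak clustering (illustrative; parameters are assumptions)."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Local density: number of other points closer than `radius`.
    rho = (dist < radius).sum(axis=1) - 1
    # Visit points from highest to lowest density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-density neighbor.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # distance to the nearest higher-density point
        parent[i] = j
    # Assumed selection rule: peaks are far from their nearest higher-density point.
    is_peak = delta > delta_min
    is_peak[order[0]] = True
    labels = np.full(n, -1)
    labels[is_peak] = np.arange(is_peak.sum())
    # Every non-peak point inherits the label of its nearest higher-density point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta


def merge_single_linkage(X, labels, k):
    """Merge the two closest clusters (minimum point-to-point distance) until k remain."""
    labels = labels.copy()
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    while len(np.unique(labels)) > k:
        ids = np.unique(labels)
        best_d, best_a, best_b = np.inf, None, None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                d = dist[np.ix_(labels == a, labels == b)].min()
                if d < best_d:
                    best_d, best_a, best_b = d, a, b
        labels[labels == best_b] = best_a
    return labels
```

In this sketch, lowering `delta_min` produces more peaks and therefore more preliminary clusters, which the single-linkage step can then merge down to `k`. The pairwise distance matrix makes it suitable only for small datasets.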

Updated: 2020-07-14