Improving Spectral Clustering using the Asymptotic Value of the Normalised Cut
Journal of Computational and Graphical Statistics (IF 1.4), Pub Date: 2019-05-20, DOI: 10.1080/10618600.2019.1593180
David P. Hofmeyr

Abstract: Spectral clustering (SC) is a popular and versatile clustering method based on a relaxation of the normalized graph cut objective. Despite its popularity, selecting the number of clusters and tuning the important scaling parameter remain challenging problems in practical applications of SC. Popular heuristics have been proposed, but corresponding theoretical results are scarce. In this article, we investigate the asymptotic value of the normalized cut for an increasing sample assumed to arise from an underlying probability distribution. Based on this, we find strong connections between spectral and density clustering. This enables us to provide recommendations for selecting the number of clusters and setting the scaling parameter in a data-driven manner. An algorithm inspired by these recommendations is proposed, which we have found to exhibit strong performance in a range of applied domains. An R implementation of the algorithm is available from https://github.com/DavidHofmeyr/spuds. Supplementary materials for this article are available online.
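
The abstract refers to the normalized cut and a scaling parameter without defining either. As a rough illustration only, the sketch below evaluates the textbook normalized cut (for each cluster, the weight of edges leaving it divided by its total degree, summed over clusters) on a toy data set. It assumes a Gaussian similarity whose bandwidth `sigma` plays the role of the scaling parameter; that kernel choice, the zero diagonal, the function names, and the toy points are all assumptions made for this example and are not taken from the article or from the spuds package.

```python
import math


def gaussian_affinity(points, sigma):
    """Pairwise Gaussian similarities; the diagonal is left at zero (an assumed convention)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return w


def normalized_cut(w, labels):
    """Sum over clusters C of cut(C, complement of C) / vol(C)."""
    n = len(w)
    degree = [sum(row) for row in w]
    total = 0.0
    for c in set(labels):
        members = [i for i in range(n) if labels[i] == c]
        vol = sum(degree[i] for i in members)  # total degree of the cluster
        cut = sum(w[i][j] for i in members for j in range(n) if labels[j] != c)
        total += cut / vol
    return total


# Two well-separated groups of three points each (toy data, purely illustrative).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
w = gaussian_affinity(points, sigma=0.5)

good = normalized_cut(w, [0, 0, 0, 1, 1, 1])  # labels follow the two groups
bad = normalized_cut(w, [0, 1, 0, 1, 0, 1])   # labels mix the two groups
print(good, bad)
```

A partition that follows the two groups yields a value close to zero, while one that mixes them does not. Changing `sigma` changes every edge weight and therefore the value of the cut, which is one way to see why the choice of scaling parameter matters in practice.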

Last updated: 2019-05-20