Certifying Global Optimality of Graph Cuts via Semidefinite Relaxation: A Performance Guarantee for Spectral Clustering
Foundations of Computational Mathematics (IF 2.5), Pub Date: 2019-06-13, DOI: 10.1007/s10208-019-09421-3
Shuyang Ling, Thomas Strohmer

Spectral clustering has become one of the most widely used clustering techniques when the structure of the individual clusters is non-convex or highly anisotropic. Yet, despite its immense popularity, there exists fairly little theory about performance guarantees for spectral clustering. This issue is partly due to the fact that spectral clustering typically involves two steps that complicate its theoretical analysis: first, the eigenvectors of the associated graph Laplacian are used to embed the dataset, and second, the k-means clustering algorithm is applied to the embedded dataset to get the labels. This paper is devoted to the theoretical foundations of spectral clustering and graph cuts. We consider a convex relaxation of graph cuts, namely ratio cuts and normalized cuts, that makes the usual two-step approach of spectral clustering obsolete and at the same time gives rise to a rigorous theoretical analysis of graph cuts and spectral clustering. We derive deterministic bounds for successful spectral clustering via a spectral proximity condition that naturally depends on the algebraic connectivity of each cluster and the inter-cluster connectivity. Moreover, we demonstrate by means of some popular examples that our bounds can achieve near optimality. Our findings are also fundamental to the theoretical understanding of kernel k-means. Numerical simulations confirm and complement our analysis.
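
For readers unfamiliar with the usual two-step procedure mentioned in the abstract, the following is a minimal sketch of it, not the convex relaxation the paper proposes. It rests on several assumptions that are not taken from the paper: Gaussian-kernel edge weights with bandwidth `sigma`, the unnormalized Laplacian L = D - W, and a plain Lloyd k-means with a deterministic farthest-point initialization. The function name `spectral_clustering` and its parameters are hypothetical.

```python
import numpy as np


def spectral_clustering(points, k, sigma=1.0, n_iter=50):
    """Two-step sketch: Laplacian eigenvector embedding, then k-means."""
    X = np.asarray(points, dtype=float)

    # Similarity graph with Gaussian-kernel weights (an assumed choice).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    # Step 1: embed each point using the eigenvectors that belong to the
    # k smallest eigenvalues (eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]

    # Step 2: Lloyd's k-means on the rows of U, with a deterministic
    # farthest-point initialization (an assumed choice for reproducibility).
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Two small, well-separated groups of points in the plane.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5),
          (3.0, 0.0), (3.5, 0.0), (3.0, 0.5), (3.5, 0.5)]
labels = spectral_clustering(points, k=2)
print(labels)
```

The k-means step is what makes this pipeline hard to analyze: its result depends on the initialization and it only finds a local optimum, which is part of the motivation the abstract gives for replacing both steps with a single convex program.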

Updated: 2019-06-13