Discrete Optimal Graph Clustering
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 12-7-2018, DOI: 10.1109/tcyb.2018.2881539
Yudong Han, Lei Zhu, Zhiyong Cheng, Jingjing Li, Xiaobai Liu

Graph-based clustering is one of the major families of clustering methods. Most such methods work in three separate steps: 1) similarity graph construction; 2) cluster label relaxation; and 3) label discretization with k-means (KM). This common practice has three disadvantages: 1) the predefined similarity graph is usually fixed and may not be optimal for the subsequent clustering; 2) relaxing the cluster labels may cause significant information loss; and 3) label discretization may deviate from the true clustering result, since KM is sensitive to the initialization of cluster centroids. To tackle these problems, we propose an effective discrete optimal graph clustering framework. A structured similarity graph that is theoretically optimal for clustering performance is learned adaptively under the guidance of reasonable rank constraints. To avoid information loss, we explicitly enforce a discrete transformation on the intermediate continuous labels, which yields a tractable optimization problem with a discrete solution. Furthermore, to compensate for the unreliability of the learned labels and to improve clustering accuracy, we design an adaptive robust module that learns a prediction function for unseen data from the learned discrete cluster labels. Finally, an iterative optimization strategy with guaranteed convergence is developed to solve directly for the clustering results. Extensive experiments on both real and synthetic datasets demonstrate the superiority of the proposed methods over several state-of-the-art clustering approaches.
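For orientation, the conventional three-step pipeline that the abstract criticizes can be sketched as follows. This is a minimal illustration of that baseline, not the authors' proposed framework. The function names, the Gaussian kernel, the bandwidth `sigma`, and the use of the unnormalized Laplacian are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Step 1: build a fixed, predefined similarity graph (dense Gaussian kernel)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def relaxed_labels(W, k):
    """Step 2: continuous relaxation of the cluster labels.

    Returns the k eigenvectors of the unnormalized Laplacian L = D - W
    that belong to the k smallest eigenvalues.
    """
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return eigvecs[:, :k]


def kmeans(F, k, n_iter=50, seed=0):
    """Step 3: discretize the relaxed labels with plain Lloyd's k-means.

    The outcome can depend on `seed`, which is the initialization
    sensitivity the abstract points out.
    """
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), size=k, replace=False)]
    labels = np.zeros(len(F), dtype=int)
    for _ in range(n_iter):
        dist = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = F[labels == j].mean(axis=0)
    return labels


def three_step_clustering(X, k, sigma=1.0, seed=0):
    """Run the three separate steps one after another."""
    W = gaussian_similarity(X, sigma)
    F = relaxed_labels(W, k)
    return kmeans(F, k, seed=seed)
```

Calling `three_step_clustering(X, k=2)` on an array `X` of shape `(n_samples, n_features)` returns one integer label per sample. In this sketch the graph `W` is never revisited after step 1 and the discrete labels come only from k-means in step 3, which are the two decoupling points the abstract identifies as weaknesses.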

Updated: 2024-08-22