Progressive Self-Supervised Clustering With Novel Category Discovery.
IEEE Transactions on Cybernetics (IF 11.8), Pub Date: 2022-09-19, DOI: 10.1109/tcyb.2021.3069836
Jingyu Wang, Zhenyu Ma, Feiping Nie, Xuelong Li

Clustering is one of the classical approaches to analyzing data structure in machine learning and pattern recognition. Recently, anchor-based graphs have been widely adopted to improve the accuracy of many graph-based clustering techniques. To achieve better clustering performance, we propose a novel approach called progressive self-supervised clustering with novel category discovery (PSSCNCD), which consists of three procedures. First, we propose a new semisupervised framework with novel category discovery to guide label propagation; it is reinforced by a parameter-insensitive anchor-based graph obtained from balanced K-means and hierarchical K-means (BKHK). Second, building on this semisupervised framework, we design a novel representative-point selection strategy that progressively discovers representative points and assigns each a pseudolabel, where every pseudolabel is hypothesized to correspond to a real category in each round of self-supervised label propagation. Third, once sufficient representative points have been found, the labels of all samples are predicted to obtain the final clustering result. Experimental results on several toy examples and benchmark data sets demonstrate that our method outperforms other clustering approaches.
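
The three procedures in the abstract can be illustrated with a rough sketch. This is not the authors' PSSCNCD implementation, and it rests on several assumptions: a dense Gaussian affinity matrix stands in for the BKHK anchor-based graph, a standard closed-form label propagation stands in for the paper's semisupervised framework, and the rule for picking the next representative point (the sample with the weakest support from any existing pseudolabel) is an illustrative heuristic only. The function names and the `sigma` and `alpha` parameters are hypothetical.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix (assumed stand-in for an anchor graph)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def propagate(W, reps, alpha=0.9):
    """Closed-form label propagation: F = (I - alpha * S)^-1 Y."""
    n = W.shape[0]
    d = W.sum(1)
    S = W / np.sqrt(np.outer(d, d))  # symmetrically normalized affinity
    Y = np.zeros((n, len(reps)))
    for label, idx in enumerate(reps):
        Y[idx, label] = 1.0  # one pseudolabel per representative point
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


def progressive_clustering(X, n_clusters, sigma=1.0, alpha=0.9):
    """Discover representatives one at a time, then label every sample."""
    W = gaussian_affinity(X, sigma)
    # Assumed starting rule: begin with the highest-degree sample.
    reps = [int(np.argmax(W.sum(1)))]
    while len(reps) < n_clusters:
        F = propagate(W, reps, alpha)
        # Assumed heuristic: the sample least supported by any existing
        # pseudolabel is treated as the representative of a new category.
        reps.append(int(np.argmin(F.max(1))))
    F = propagate(W, reps, alpha)  # final propagation with all pseudolabels
    return F.argmax(1), reps
```

Calling `progressive_clustering(X, n_clusters=3)` on an `(n, d)` array returns one cluster label per sample and the indices of the chosen representative points. The dense `n x n` solve is practical only for small data sets; an anchor-based graph, as the abstract describes, is what makes this kind of propagation practical at larger scale.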

Updated: 2021-04-20