Deep Self-Evolution Clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 12-27-2018, DOI: 10.1109/tpami.2018.2889949
Jianlong Chang, Gaofeng Meng, Lingfeng Wang, Shiming Xiang, Chunhong Pan

Clustering is a crucial but challenging task in pattern analysis and machine learning. Existing methods often treat representation learning and clustering separately rather than combining them. To address this, we revisit the clustering task from its definition and develop Deep Self-Evolution Clustering (DSEC), which jointly learns representations and clusters data. The clustering task is recast as a binary pairwise-classification problem: estimating whether a pair of patterns is similar. Specifically, the similarity between two patterns is defined as the dot product between their indicator features, which are generated by a deep neural network (DNN). To learn informative representations for clustering, clustering constraints are imposed on the indicator features so that specific concepts are represented by specific representations. Since ground-truth similarities are unavailable in clustering, an alternating iterative algorithm called Self-Evolution Clustering Training (SECT) is presented, which alternately selects similar and dissimilar pairs of patterns and trains the DNN. Consequently, the indicator features tend toward one-hot vectors, and patterns can be clustered by locating the largest response of the learned indicator features. Extensive experiments show that DSEC consistently outperforms current models on twelve popular image, text and audio datasets.
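
The pairwise formulation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the L2 normalization, the `upper` and `lower` thresholds, the function names, and the toy feature values are all assumptions made for illustration, and the DNN training step is omitted. It only shows how dot-product similarities between indicator features might be used to pick confident similar and dissimilar pairs, and how cluster labels follow from the largest response.

```python
import numpy as np


def pairwise_similarity(features):
    """Dot product between L2-normalized indicator features (normalization is an assumption)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def select_pairs(sim, upper=0.9, lower=0.1):
    """Pick confident pairs; both thresholds are hypothetical values, not from the paper."""
    similar = sim >= upper      # treated as positive pairs
    dissimilar = sim <= lower   # treated as negative pairs
    return similar, dissimilar  # pairs in between are left unlabeled


def assign_clusters(features):
    """Cluster each pattern by the index of its largest indicator response."""
    return np.argmax(features, axis=1)


# Toy indicator features for three patterns and three clusters (made-up values).
features = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
])

sim = pairwise_similarity(features)
similar, dissimilar = select_pairs(sim)
labels = assign_clusters(features)
```

In an alternating scheme of the kind the abstract describes, the selected pairs would serve as binary training targets for the network, and the selection would be repeated with the updated features.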

Updated: 2024-08-22