Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering
Neural Processing Letters (IF 2.6), Pub Date: 2020-10-19, DOI: 10.1007/s11063-020-10375-9
Xiaofei Zhu, Khoi Duy Do, Jiafeng Guo, Jun Xu, Stefan Dietze

Clustering is an essential data analysis technique and has been studied extensively over the last decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and these form two important lines of research. The first line attempts to learn representative features for clustering, especially with deep neural networks. The second exploits the geometric structure information within the data. Although both have achieved promising performance on many clustering tasks, few efforts have been dedicated to combining them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to address this gap. It simultaneously models the data generating distribution, cluster assignment consistency, and the geometric structure of the data in a unified framework. The proposed method can be optimized with mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST); the experimental results show that our model outperforms the baseline models and achieves state-of-the-art performance.
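
The abstract names three modeling goals but does not give the loss functions. As an illustration only, the sketch below assumes one plausible instantiation: a mean-squared reconstruction error for the data generating distribution, a DEC-style KL divergence between soft assignments and a sharpened target for cluster assignment consistency, and a graph-Laplacian penalty over a k-nearest-neighbor graph for geometric structure. The function names, the weights `gamma` and `lam`, and the choice of a kNN graph are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment of embeddings z (n, d) to centers (k, d)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target: square q, normalize by cluster frequency, then by row."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def knn_affinity(x, k=3):
    """Symmetric 0/1 affinity matrix of a k-nearest-neighbor graph (assumed choice)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]       # k closest points per row
    w = np.zeros((len(x), len(x)))
    rows = np.repeat(np.arange(len(x)), k)
    w[rows, idx.ravel()] = 1.0
    return np.maximum(w, w.T)                 # symmetrize

def combined_loss(x, x_rec, z, centers, w, gamma=0.1, lam=0.01):
    """Assumed objective: reconstruction + gamma * KL(P || Q) + lam * tr(Z^T L Z) / n."""
    rec = np.mean((x - x_rec) ** 2)
    q = soft_assign(z, centers)
    p = target_distribution(q)
    kl = np.sum(p * np.log(p / q)) / len(x)
    lap = np.diag(w.sum(axis=1)) - w          # unnormalized graph Laplacian
    manifold = np.trace(z.T @ lap @ z) / len(x)
    return rec + gamma * kl + lam * manifold, (rec, kl, manifold)
```

Here `x_rec` and `z` stand in for the outputs of a decoder and an encoder. In a trainable version the same three terms would be written in an automatic-differentiation framework and minimized over mini-batches, as the abstract describes.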




Updated: 2020-10-19