Subspace clustering without knowing the number of clusters: A parameter free approach
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2020-01-01, DOI: 10.1109/tsp.2020.3018665
Vishnu Menon, Gokularam Muthukrishnan, Sheetal Kalyani

Subspace clustering, the task of clustering high-dimensional data when the data points come from a union of subspaces, is one of the fundamental tasks in unsupervised machine learning. Most existing algorithms for this task require prior knowledge of the number of clusters, along with a few additional parameters that must be set or tuned a priori according to the type of data to be clustered. In this work, a parameter-free method for subspace clustering is proposed, in which data points are clustered on the basis of the difference between the statistical distribution of the angles subtended by data points within a subspace and that of the angles subtended by points belonging to different subspaces. Given an initial fine clustering, the proposed algorithm merges clusters until a final clustering is obtained. Unlike many existing methods, this does not require the number of clusters a priori. The proposed algorithm also does not involve an unknown parameter or tuning for one. A parameter-free method for producing a fine initial clustering is also discussed, making the whole process of subspace clustering parameter free. A comparison of the proposed algorithm's performance with that of existing state-of-the-art techniques on synthetic and real data sets shows the significance of the proposed method.
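
The intuition behind the angle-based criterion can be illustrated with a small numerical sketch. This is not the authors' algorithm: it only samples points from two random subspaces and compares the mean absolute cosine of the angles within one subspace against that across the two. The function names, the dimensions (ambient dimension 100, subspace dimension 5), the sample size, and the random seed are all assumptions chosen for illustration.

```python
import numpy as np


def sample_subspace_points(rng, ambient_dim, sub_dim, n_points):
    """Draw unit-norm points from a random sub_dim-dimensional subspace."""
    # Orthonormal basis of a random subspace (reduced QR of a Gaussian matrix).
    basis, _ = np.linalg.qr(rng.standard_normal((ambient_dim, sub_dim)))
    # Random coefficients in that basis, one column per point.
    points = basis @ rng.standard_normal((sub_dim, n_points))
    return points / np.linalg.norm(points, axis=0, keepdims=True)


def mean_abs_cosine(a, b, exclude_diagonal=False):
    """Mean |cos(angle)| over all pairs (column of a, column of b)."""
    cos = np.abs(a.T @ b)
    if exclude_diagonal:
        # Drop each point's angle with itself when a and b are the same set.
        mask = ~np.eye(cos.shape[0], dtype=bool)
        return float(cos[mask].mean())
    return float(cos.mean())


rng = np.random.default_rng(0)  # assumed seed, for repeatability only
x1 = sample_subspace_points(rng, ambient_dim=100, sub_dim=5, n_points=200)
x2 = sample_subspace_points(rng, ambient_dim=100, sub_dim=5, n_points=200)

within = mean_abs_cosine(x1, x1, exclude_diagonal=True)
across = mean_abs_cosine(x1, x2)
print(f"mean |cos| within one subspace: {within:.3f}")
print(f"mean |cos| across subspaces:    {across:.3f}")
```

Under these assumptions, points that share a low-dimensional subspace tend to subtend smaller angles (larger absolute cosines) than points drawn from different subspaces, which is the kind of distributional difference the abstract describes exploiting when deciding whether to merge clusters.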

Updated: 2020-01-01