Non-Redundant Subspace Clusterings with Nr-Kmeans and Nr-DipMeans


ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2020-06-22, DOI: 10.1145/3385652
Dominik Mautz, Wei Ye, Claudia Plant, Christian Böhm


A large collection of objects in a high-dimensional space can often be clustered in more than one way; for instance, objects could be clustered by their shape or, alternatively, by their color. Each grouping represents a different view of the dataset. The new research field of non-redundant clustering addresses this class of problems. In this article, we follow the approach that different, non-redundant k-means-like clusterings may exist in different, arbitrarily oriented subspaces of the high-dimensional space. We assume that these subspaces (and optionally a further noise space without any cluster structure) are orthogonal to each other. This assumption enables a particularly rigorous mathematical treatment of the non-redundant clustering problem and thus a particularly efficient algorithm, which we call Nr-Kmeans (for non-redundant k-means). The superiority of our algorithm is demonstrated both theoretically and in extensive experiments. Further, we propose an extension of Nr-Kmeans that harnesses Hartigan's dip test to identify the number of clusters for each subspace automatically.
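The idea of separate k-means-like clusterings living in mutually orthogonal subspaces can be illustrated with a small cost computation. The following is only a sketch under stated assumptions, not the authors' algorithm: it takes an orthogonal matrix (here the identity), a split of the rotated dimensions into subspaces, and one labeling per subspace as given, and merely evaluates the summed k-means cost. It does not optimize the rotation, the split, or the labels, and the names `rotate`, `subspace_cost`, and `total_cost` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict


def rotate(points, rotation):
    """Express each point in the basis formed by the columns of `rotation`."""
    d = len(rotation)
    return [[sum(p[r] * rotation[r][c] for r in range(d)) for c in range(d)]
            for p in points]


def subspace_cost(rotated, dims, labels):
    """k-means cost (squared distances to cluster means) restricted to `dims`."""
    groups = defaultdict(list)
    for point, label in zip(rotated, labels):
        groups[label].append([point[d] for d in dims])
    cost = 0.0
    for members in groups.values():
        mean = [sum(col) / len(members) for col in zip(*members)]
        cost += sum((v - m) ** 2 for row in members for v, m in zip(row, mean))
    return cost


def total_cost(points, rotation, subspaces, labelings):
    """Sum the per-subspace costs; each subspace has its own labeling."""
    rotated = rotate(points, rotation)
    return sum(subspace_cost(rotated, dims, labels)
               for dims, labels in zip(subspaces, labelings))


# Toy data (assumed): dimension 0 could encode "shape", dimension 1 "color".
points = [[0.0, 0.0], [0.0, 10.0], [10.0, 0.0], [10.0, 10.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Two non-redundant views, one per one-dimensional subspace.
two_views = total_cost(points, identity, [[0], [1]],
                       [[0, 0, 1, 1], [0, 1, 0, 1]])  # 0.0

# A single two-cluster view of the full space cannot capture both groupings.
one_view = total_cost(points, identity, [[0, 1]], [[0, 0, 1, 1]])  # 100.0
```

On this toy data, the two labelings together explain all of the variance, while a single two-cluster labeling of the full space leaves the second grouping as unexplained cost.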


Updated: 2020-06-22
