Joint image clustering and feature selection with auto-adjoined learning for high-dimensional data
Knowledge-Based Systems (IF 8.8) Pub Date: 2021-08-27, DOI: 10.1016/j.knosys.2021.107443
Xiaodong Wang, Pengtao Wu, Qinghua Xu, Zhiqiang Zeng, Yong Xie

Due to the rapid development of modern multimedia techniques, high-dimensional image data are frequently encountered in many image analysis tasks, such as clustering and feature learning. K-means (KM) is one of the most widely used and efficient tools for clustering high-dimensional data. However, because such data commonly contain irrelevant features or noise, conventional KM suffers degraded performance on high-dimensional data. Recent studies try to overcome this problem by combining KM with subspace learning. Nevertheless, they usually depend on eigenvalue decomposition, which requires expensive computational resources. Moreover, their clustering models ignore the local manifold structure among the data, failing to exploit the underlying adjacency information. Two points are critical for clustering high-dimensional image data: efficient feature selection and clear adjacency exploration. Based on these considerations, we propose an auto-adjoined subspace clustering method. Concretely, to efficiently locate redundant features, we impose an extremely sparse feature selection matrix on KM, which is easy to optimize. In addition, to accurately encode the local adjacency among the data without the influence of noise, we propose to automatically assign the connectivity of each sample in the low-dimensional feature space. Compared with several state-of-the-art clustering methods, the proposed method consistently improves clustering performance on six publicly available benchmark image datasets, demonstrating its effectiveness.
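The core idea of the abstract — jointly clustering with k-means while selecting the discriminative features — can be illustrated with a much simpler alternating scheme. This sketch is NOT the paper's method (which uses a sparse feature-selection matrix and auto-adjoined graph learning); it merely alternates plain k-means on the currently selected features with re-scoring features by between-cluster scatter. Function names such as `joint_cluster_select` are hypothetical.

```python
import numpy as np

def kmeans(Xs, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialization."""
    idx = [0]
    for _ in range(1, k):
        # distance of every sample to its nearest chosen center so far
        d2 = ((Xs[:, None, :] - Xs[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(d2.argmax()))
    centers = Xs[idx].copy()
    labels = np.zeros(len(Xs), dtype=int)
    for _ in range(n_iter):
        labels = ((Xs[:, None, :] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = Xs[labels == c].mean(axis=0)
    return labels

def joint_cluster_select(X, k, m, n_rounds=5):
    """Alternate (a) k-means on m selected features with
    (b) re-selecting the m features with largest between-cluster scatter."""
    d = X.shape[1]
    sel = np.arange(m)                       # start with the first m features
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_rounds):
        labels = kmeans(X[:, sel], k)
        overall = X.mean(axis=0)
        score = np.zeros(d)
        for c in range(k):
            mask = labels == c
            if mask.any():
                # between-cluster scatter contribution of cluster c, per feature
                score += mask.sum() * (X[mask].mean(axis=0) - overall) ** 2
        sel = np.sort(np.argsort(score)[-m:])  # keep the m most discriminative
    return labels, sel
```

On toy data where only two of several features separate the clusters, the scheme recovers both the partition and the informative features; the paper's formulation instead optimizes both objectives jointly and adds a learned adjacency graph for the manifold structure.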




Updated: 2021-09-10