A Novel Convex Clustering Method for High-Dimensional Data Using Semiproximal ADMM,Mathematical Problems in Engineering


Mathematical Problems in Engineering (IF 1.430), Pub Date: 2020-09-21, DOI: 10.1155/2020/9216351
Huangyue Chen, Lingchen Kong, Yan Li


Clustering is an important component of unsupervised learning; classical methods include K-means and hierarchical clustering. These methods can be unstable because they tend to get trapped in local optima of the underlying nonconvex optimization model. In this paper, we propose a new convex clustering method for high-dimensional data based on the sparse group lasso penalty, which simultaneously groups observations and eliminates noninformative features. In this method, the number of clusters is learned from the data rather than supplied in advance as a parameter. We prove that the proposed method has desirable statistical properties, including a finite-sample error bound and feature screening consistency. Furthermore, a semiproximal alternating direction method of multipliers (ADMM) is designed to solve the sparse group lasso convex clustering model, and its convergence is established without any conditions. Finally, the effectiveness of the proposed method is demonstrated through simulated experiments and real applications.
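
The abstract does not state the optimization model, so the following is only a minimal sketch of how a convex clustering objective with a sparse group lasso penalty is commonly written, and of how clusters and retained features can be read off a solution. Everything here is an assumption made for illustration: the exact form of the objective, the unit pairwise weights, the parameter names `gamma1`, `gamma2`, `gamma3`, and the helper names `objective`, `read_clusters`, and `selected_features`. The paper's actual formulation and its semiproximal ADMM solver may differ; no solver is implemented here.

```python
import math

def objective(X, A, gamma1, gamma2, gamma3):
    """Assumed objective (not taken from the paper):
    0.5 * ||X - A||_F^2
    + gamma1 * sum_{i<k} ||A[i] - A[k]||_2   (fuses observation centroids, unit weights)
    + gamma2 * sum_j ||A[:, j]||_2           (group lasso on feature columns)
    + gamma3 * sum_{i,j} |A[i][j]|           (lasso on individual entries)
    """
    n, p = len(X), len(X[0])
    fit = 0.5 * sum((X[i][j] - A[i][j]) ** 2 for i in range(n) for j in range(p))
    fusion = sum(math.dist(A[i], A[k]) for i in range(n) for k in range(i + 1, n))
    group = sum(math.sqrt(sum(A[i][j] ** 2 for i in range(n))) for j in range(p))
    l1 = sum(abs(A[i][j]) for i in range(n) for j in range(p))
    return fit + gamma1 * fusion + gamma2 * group + gamma3 * l1

def read_clusters(A, tol=1e-8):
    """Observations whose centroid rows coincide (within tol) share a label,
    so the number of clusters is the number of distinct rows of A."""
    labels, reps = [], []
    for row in A:
        for k, rep in enumerate(reps):
            if math.dist(row, rep) <= tol:
                labels.append(k)
                break
        else:
            reps.append(row)
            labels.append(len(reps) - 1)
    return labels

def selected_features(A, tol=1e-8):
    """A feature is kept when its centroid column is not numerically zero."""
    p = len(A[0])
    return [j for j in range(p) if math.sqrt(sum(row[j] ** 2 for row in A)) > tol]

# Toy data: 4 observations, 2 features; the second feature is pure noise.
X = [[1.0, 0.1], [1.2, -0.1], [-1.0, 0.0], [-1.2, 0.05]]
# A hand-made candidate centroid matrix (not the output of any solver).
A = [[1.1, 0.0], [1.1, 0.0], [-1.1, 0.0], [-1.1, 0.0]]

labels = read_clusters(A)
print("labels:", labels)                        # labels: [0, 0, 1, 1]
print("clusters found:", len(set(labels)))      # clusters found: 2
print("features kept:", selected_features(A))   # features kept: [0]
print("objective:", objective(X, A, 0.1, 0.1, 0.1))
```

In this toy setting the fusion term makes rows of `A` coincide, which is why the cluster count falls out of the solution rather than being a parameter, and the column-wise penalty zeroes out the noise feature.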


Updated: 2020-09-22
