Non-Greedy L21-Norm Maximization for Principal Component Analysis,IEEE Transactions on Image Processing


IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2021-05-19, DOI: 10.1109/tip.2021.3073282
Feiping Nie, Lai Tian, Heng Huang, Chris Ding

Principal Component Analysis (PCA) is one of the most important unsupervised methods for handling high-dimensional data. However, because its eigendecomposition-based solution has high computational complexity, PCA is hard to apply to large-scale, high-dimensional data, e.g., millions of data points with millions of variables. In addition, its squared L2-norm objective makes it sensitive to outliers. Recent work proposed an L1-norm maximization based PCA method that is efficient to compute and robust to outliers. However, that work used a greedy strategy to solve for the eigenvectors. Moreover, the L1-norm maximization objective may not be the correct robust PCA formulation, because it loses the theoretical connection to minimizing the data reconstruction error, which is one of the most important intuitions and goals of PCA. In this paper, we propose to maximize an L21-norm based robust PCA objective, which is theoretically connected to the minimization of reconstruction error. More importantly, we propose efficient non-greedy optimization algorithms, with theoretically guaranteed convergence, for both our objective and the more general L21-norm maximization problem. Experimental results on real-world data sets show the effectiveness of the proposed method for principal component analysis.
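The abstract does not spell out the objective or the update rule, so the following is only a minimal sketch under stated assumptions. It assumes the L21-norm objective takes the common form of maximizing the sum over samples of ||W^T x_i||_2 subject to W^T W = I, and it assumes a non-greedy update that recomputes all columns of W at once from the polar factor of an auxiliary matrix. The function names `l21_objective` and `l21_pca`, the parameters `n_iter`, `tol`, and `seed`, and the stopping rule are hypothetical and are not taken from the paper.

```python
import numpy as np


def l21_objective(X, W):
    """Sum over samples (columns of X) of the Euclidean norm of W^T x_i."""
    return np.linalg.norm(W.T @ X, axis=0).sum()


def l21_pca(X, k, n_iter=100, tol=1e-8, seed=0):
    """Illustrative non-greedy L21-norm maximization (assumed form).

    X : (d, n) array, columns are centered samples.
    k : number of projection directions.
    Returns W : (d, k) array with orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    d, _ = X.shape
    # Random orthonormal starting point (an assumption; any feasible W works).
    W, _ = np.linalg.qr(rng.standard_normal((d, k)))
    prev = l21_objective(X, W)
    for _ in range(n_iter):
        P = W.T @ X                                # (k, n) projections
        norms = np.linalg.norm(P, axis=0)
        A = P / np.maximum(norms, 1e-12)           # unit direction per sample
        M = X @ A.T                                # (d, k) auxiliary matrix
        # Maximizing trace(W^T M) over orthonormal W gives the polar factor
        # of M, so all k directions are updated jointly rather than greedily.
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        W = U @ Vt
        cur = l21_objective(X, W)
        if cur - prev <= tol * max(prev, 1.0):     # hypothetical stopping rule
            break
        prev = cur
    return W
```

Under these assumptions each iteration cannot decrease the objective, because the new W maximizes a linear lower bound that is tight at the previous W. For orthonormal W the identity ||x - W W^T x||_2^2 = ||x||_2^2 - ||W^T x||_2^2 also holds, which is one way to see how maximizing projection norms relates to reducing reconstruction error.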


Updated: 2021-05-19
