On the Convergence of Projected-Gradient Methods with Low-Rank Projections for Smooth Convex Minimization over Trace-Norm Balls and Related Problems
SIAM Journal on Optimization (IF 2.6). Pub Date: 2021-03-01. DOI: 10.1137/18m1233170
Dan Garber

SIAM Journal on Optimization, Volume 31, Issue 1, Page 727-753, January 2021.
Smooth convex minimization over the unit trace-norm ball is an important optimization problem in machine learning, signal processing, statistics, and other fields that underlies many tasks in which one wishes to recover a low-rank matrix from certain measurements. While first-order methods for convex optimization enjoy optimal convergence rates, in the worst case they require computing a full-rank SVD on each iteration in order to compute the Euclidean projection onto the trace-norm ball. These full-rank SVD computations, however, prohibit the application of such methods to large-scale problems. A simple and natural heuristic for reducing the computational cost of such methods is to approximate the Euclidean projection using only a low-rank SVD. This raises the question of whether, and under what conditions, this simple heuristic can indeed result in provable convergence to the optimal solution. In this paper we show that any optimal solution is the center of a Euclidean ball inside which the projected-gradient mapping admits a rank that is at most the multiplicity of the largest singular value of the gradient vector at this optimal point. Moreover, the radius of the ball scales with the spectral gap of this gradient vector. We show how this readily implies the local convergence (i.e., from a "warm-start" initialization) of standard first-order methods such as the projected-gradient method and accelerated gradient methods, using only low-rank SVD computations. We also quantify the effect of "over-parameterization," i.e., using SVD computations of higher rank, on the radius of this ball, showing that it can increase dramatically with even a moderately larger rank. We further extend our results to the settings of smooth convex minimization with trace-norm regularization and smooth convex optimization over bounded-trace positive semidefinite matrices. Our theoretical investigation is supported by concrete empirical evidence demonstrating the correct convergence of first-order methods with low-rank projections on the matrix-completion task with real-world datasets.
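To make the heuristic concrete, below is a minimal NumPy/SciPy sketch of projected gradient descent over a trace-norm ball in which the Euclidean projection is approximated with a rank-r SVD, in the spirit of the abstract. The toy matrix-completion instance, the function names (project_l1, lowrank_trace_ball_projection), the step size, and the zero initialization are illustrative assumptions, not the paper's implementation; in particular, the paper's guarantee is local (it presumes a warm start), which this toy run ignores.

```python
# Sketch only: projected gradient descent over {X : ||X||_* <= tau},
# with the Euclidean projection approximated via a rank-r SVD.
import numpy as np
from scipy.sparse.linalg import svds


def project_l1(s, tau):
    """Project a nonnegative vector s onto the l1-ball of radius tau
    (standard sort-based scheme); on singular values this amounts to
    soft-thresholding with a data-dependent threshold."""
    if s.sum() <= tau:
        return s
    u = np.sort(s)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - tau)[0][-1]
    theta = (css[rho] - tau) / (rho + 1.0)
    return np.maximum(s - theta, 0.0)


def lowrank_trace_ball_projection(Y, tau, r):
    """Approximate the projection of Y onto the tau trace-norm ball using
    only a rank-r SVD. It coincides with the exact projection whenever the
    exact projection has rank <= r, which is the regime the paper analyzes
    in a ball around an optimal solution."""
    U, s, Vt = svds(Y, k=r)  # top-r singular triplets of Y
    return (U * project_l1(s, tau)) @ Vt


# Toy matrix-completion instance: observe a random half of a rank-2 matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M.shape) < 0.5            # observed entries
tau = np.linalg.norm(M, "nuc")              # ball radius (so M is feasible)
r, eta = 4, 1.0                             # r > true rank: mild over-parameterization

X = np.zeros_like(M)
for t in range(300):
    grad = mask * (X - M)                   # gradient of 0.5*||P_Omega(X - M)||_F^2
    X = lowrank_trace_ball_projection(X - eta * grad, tau, r)

print("relative recovery error:", np.linalg.norm(X - M) / np.linalg.norm(M))
```

Each iteration touches only r singular triplets instead of a full SVD, which is the entire computational point of the heuristic; the choice r = 4 above illustrates the "over-parameterization" discussed in the abstract, taking the SVD rank slightly above the rank of the sought solution.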


Updated: 2021-03-21