The Global Optimization Geometry of Low-Rank Matrix Optimization
IEEE Transactions on Information Theory (IF 2.5) Pub Date: 2021-02-01, DOI: 10.1109/tit.2021.3049171
Zhihui Zhu, Qiuwei Li, Gongguo Tang, Michael B. Wakin

This paper considers general rank-constrained optimization problems that minimize a general objective function $f(X)$ over the set of rectangular $n\times m$ matrices that have rank at most $r$. To tackle the rank constraint and also to reduce the computational burden, we factorize $X$ into $UV^{\mathrm{T}}$, where $U$ and $V$ are $n\times r$ and $m\times r$ matrices, respectively, and then optimize over the small matrices $U$ and $V$. We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function $f$ satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis of the optimization geometry of a matrix factorization problem where we aim to find $n\times r$ and $m\times r$ matrices $U$ and $V$ such that $UV^{\mathrm{T}}$ approximates a given matrix $X^\star$. Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where $\mathrm{rank}(X^\star) = r$, but also for the over-parameterization case where $\mathrm{rank}(X^\star) < r$ and the under-parameterization case where $\mathrm{rank}(X^\star) > r$. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) with random initialization converge to a global solution.
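As an illustration of the factored approach the abstract describes, the following sketch runs plain gradient descent on the matrix factorization objective $f(U, V) = \tfrac{1}{2}\lVert UV^{\mathrm{T}} - X^\star\rVert_F^2$ in the exact-parameterization case $\mathrm{rank}(X^\star) = r$. This is not the paper's algorithm or analysis; the problem sizes, step size, iteration count, and random initialization scale are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 20, 15, 3

# Ground-truth rank-r matrix X_star (exact parameterization: rank(X_star) = r).
X_star = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))

# Random initialization; the benign geometry (no spurious local minima,
# strict saddle property) is what lets such initialization succeed.
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))

step = 0.01
for _ in range(5000):
    R = U @ V.T - X_star                         # residual U V^T - X_star
    # Gradients of 0.5 * ||U V^T - X_star||_F^2 w.r.t. U and V:
    U, V = U - step * (R @ V), V - step * (R.T @ U)

final_loss = 0.5 * np.linalg.norm(U @ V.T - X_star, "fro") ** 2
print(final_loss)
```

With these settings the loss drops many orders of magnitude from its initial value, consistent with convergence to a global minimizer despite the nonconvexity of the factored problem.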

Updated: 2021-02-01