Improved Variance Reduction Methods for Riemannian Non-Convex Optimization.
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2022-10-04, DOI: 10.1109/tpami.2021.3112139
Andi Han, Junbin Gao

Variance reduction is a popular technique for accelerating gradient descent and stochastic gradient descent on optimization problems defined over both Euclidean space and Riemannian manifolds. This paper improves on existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, by providing a unified framework for batch-size adaptation. The framework is more general than existing work in that it accounts for retraction, vector transport, and mini-batch stochastic gradients. We show that the adaptive-batch variance reduction methods require lower gradient complexities for both general non-convex and gradient-dominated functions, under both finite-sum and online optimization settings. Moreover, under the new framework we complete the analysis of R-SVRG and R-SRG, which is currently missing from the literature. We prove convergence of R-SVRG with a much simpler analysis, which leads to curvature-free complexity bounds. We also show improved results for R-SRG under double-loop convergence, which match the optimal complexities of R-SPIDER. In addition, we prove the first online complexity results for R-SVRG and R-SRG. Lastly, we discuss the potential of adapting the batch size for non-smooth, constrained, and second-order Riemannian optimizers. Extensive experiments on a variety of applications support the analysis and claims in the paper.
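To make the double-loop structure concrete, below is a minimal sketch of Riemannian SVRG on the unit sphere for a finite-sum leading-eigenvector problem. The manifold, test problem, step size, and fixed mini-batch size are illustrative assumptions, not the paper's exact setup; in particular, the adaptive batch-size schedule and the R-SRG/R-SPIDER recursive estimators are not reproduced. Retraction is implemented by renormalization, and vector transport by orthogonal projection onto the tangent space.

```python
import numpy as np

# Illustrative sketch of R-SVRG on the unit sphere (not the paper's code).
# Problem: minimize f(x) = -(1/n) * sum_i (z_i^T x)^2 over ||x|| = 1,
# whose minimizer is the leading eigenvector of the sample covariance.

rng = np.random.default_rng(0)
n, d = 1000, 20
Z = rng.standard_normal((n, d))

def grad_batch(x, idx):
    """Euclidean gradient of the sampled components, averaged over idx."""
    Zi = Z[idx]
    return -2.0 * Zi.T @ (Zi @ x) / len(idx)

def proj(x, v):
    """Project v onto the tangent space at x; also serves as a
    vector transport by projection between tangent spaces."""
    return v - (x @ v) * x

def retract(x, v):
    """Retraction on the sphere: take the step, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def rsvrg(x0, outer=30, inner=50, step=0.05, batch=10):
    x_tilde = x0
    for _ in range(outer):
        # Full Riemannian gradient at the snapshot point.
        full = proj(x_tilde, grad_batch(x_tilde, np.arange(n)))
        x = x_tilde
        for _ in range(inner):
            idx = rng.choice(n, size=batch, replace=False)  # mini-batch
            g = proj(x, grad_batch(x, idx))
            g_snap = proj(x_tilde, grad_batch(x_tilde, idx))
            # Variance-reduced direction: transport snapshot quantities
            # to the tangent space at x via projection.
            v = g - proj(x, g_snap) + proj(x, full)
            x = retract(x, -step * v)
        x_tilde = x
    return x_tilde

x = rsvrg(x0=retract(np.zeros(d), rng.standard_normal(d)))

# Sanity check: alignment with the leading eigenvector of Z^T Z / n.
w = np.linalg.eigh(Z.T @ Z / n)[1][:, -1]
print("alignment |<x, v1>| =", abs(x @ w))
```

The key design point mirrored here is that both snapshot quantities (the full gradient and the snapshot stochastic gradient) must be transported to the current tangent space before forming the variance-reduced direction; batch-size adaptation, as studied in the paper, would replace the fixed `batch` with a schedule that grows with the required accuracy.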

Updated: 2021-09-13