Journal of Complexity (IF 1.8). Pub Date: 2019-09-27. DOI: 10.1016/j.jco.2019.101438. Arnulf Jentzen, Philippe von Wurstemberger
The stochastic gradient descent (SGD) optimization algorithm is one of the central tools used to approximate solutions of stochastic optimization problems arising in machine learning and, in particular, deep learning applications. It is therefore important to analyze the convergence behavior of SGD. In this article we consider a simple quadratic stochastic optimization problem and establish for every γ, ν ∈ (0, ∞) essentially matching lower and upper bounds for the mean square error of the associated SGD process with learning rates (γ/n^ν), n ∈ ℕ. This allows us to precisely quantify the mean square convergence rate of the SGD method in dependence on the choice of the learning rates.
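The setting described above can be illustrated with a minimal sketch. This is an assumed instance, not the paper's exact problem: we minimize the quadratic objective f(θ) = E[(θ − X)²]/2 with X standard normal (minimizer θ* = 0), using SGD with polynomially decaying learning rates γ/n^ν. The names `sgd_quadratic` and `mse` are ours.

```python
import random

def sgd_quadratic(theta0, gamma, nu, n_steps, seed=0):
    """SGD on f(theta) = E[(theta - X)^2] / 2 with X ~ N(0, 1),
    so the minimizer is theta* = E[X] = 0. Uses the polynomially
    decaying learning rates gamma_n = gamma / n**nu from the abstract."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(1, n_steps + 1):
        x = rng.gauss(0.0, 1.0)          # one sample of X
        grad = theta - x                 # unbiased estimate of f'(theta)
        theta -= (gamma / n ** nu) * grad
    return theta

def mse(gamma, nu, n_steps, trials=200):
    """Monte Carlo estimate of the mean square error E[(theta_N - theta*)^2]."""
    return sum(sgd_quadratic(1.0, gamma, nu, n_steps, seed=t) ** 2
               for t in range(trials)) / trials
```

For example, with γ = 1 and ν = 1 the iterate θ_n is the running average of the samples, so the estimated mean square error shrinks roughly like 1/n as `n_steps` grows; varying ν then shows how the decay speed of the learning rates governs the convergence rate.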
Title: Lower error bounds for the stochastic gradient descent optimization algorithm: Sharp convergence rates for slowly and fast decaying learning rates