Sub-linear convergence of a tamed stochastic gradient descent method in Hilbert space
arXiv - CS - Numerical Analysis Pub Date : 2021-06-17 , DOI: arxiv-2106.09286 Monika Eisenmann, Tony Stillfjord
In this paper, we introduce the tamed stochastic gradient descent method
(TSGD) for optimization problems. Inspired by the tamed Euler scheme, which is
a commonly used method within the context of stochastic differential equations,
TSGD is an explicit scheme that exhibits stability properties similar to those
of implicit schemes. As its computational cost is essentially equivalent to
that of the well-known stochastic gradient descent method (SGD), it constitutes
a very competitive alternative to such methods. We rigorously prove (optimal) sub-linear convergence of the scheme for
strongly convex objective functions on an abstract Hilbert space. The analysis
only requires very mild step size restrictions, which illustrates the good
stability properties. The analysis is based on a priori estimates more
frequently encountered in a time integration context than in optimization, and
this alternative approach also provides a different perspective on the
convergence of SGD. Finally, we demonstrate the usability of the scheme on a
problem arising in the context of supervised learning.
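The taming idea can be illustrated with a short sketch. The update below follows the pattern of the tamed Euler scheme mentioned in the abstract, with the step normalized by `1 + h * ||g||` so that its norm stays bounded; the specific test objective, noise level, and step-size choice `h_k = 1/k` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x_star = np.array([1.0, -2.0, 0.5])  # minimizer of the toy objective

def stochastic_grad(x):
    # Gradient of the strongly convex objective f(x) = 0.5 * ||x - x_star||^2,
    # perturbed by zero-mean noise to mimic mini-batch sampling.
    return (x - x_star) + 0.1 * rng.standard_normal(x.shape)

def tamed_sgd(x0, steps=5000):
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, steps + 1):
        h = 1.0 / k  # decreasing step sizes, as in sub-linear convergence analyses
        g = stochastic_grad(x)
        # Taming: the denominator bounds the update norm by 1, so even an
        # outlier gradient cannot destabilize the explicit iteration.
        x = x - h * g / (1.0 + h * np.linalg.norm(g))
    return x

x = tamed_sgd(np.zeros(3))
print(np.linalg.norm(x - x_star))  # residual error, should be small
```

For small `h * ||g||` the update is indistinguishable from plain SGD, so the asymptotic cost and rate are unchanged; the taming only kicks in when the sampled gradient is large relative to the step size.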
Updated: 2021-06-18