Sub-linear convergence of a tamed stochastic gradient descent method in Hilbert space
arXiv - CS - Numerical Analysis Pub Date : 2021-06-17 , DOI: arxiv-2106.09286
Monika Eisenmann, Tony Stillfjord

In this paper, we introduce the tamed stochastic gradient descent method (TSGD) for optimization problems. Inspired by the tamed Euler scheme, which is a commonly used method within the context of stochastic differential equations, TSGD is an explicit scheme that exhibits stability properties similar to those of implicit schemes. As its computational cost is essentially equivalent to that of the well-known stochastic gradient descent method (SGD), it constitutes a very competitive alternative to such methods. We rigorously prove (optimal) sub-linear convergence of the scheme for strongly convex objective functions on an abstract Hilbert space. The analysis only requires very mild step size restrictions, which illustrates the good stability properties. The analysis is based on a priori estimates more frequently encountered in a time integration context than in optimization, and this alternative approach also provides a different perspective on the convergence of SGD. Finally, we demonstrate the usability of the scheme on a problem arising in the context of supervised learning.
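To make the idea of "taming" concrete, below is a minimal Python sketch of one tamed gradient step in the spirit of the tamed Euler scheme: the step is damped by a factor 1/(1 + α‖g‖), so large stochastic gradients cannot cause the explicit iteration to blow up. This is an illustrative assumption about the form of the taming, not necessarily the exact scheme analysed in the paper; the least-squares toy problem, the function names, and the 1/k step-size choice are likewise assumptions made for the example.

```python
import numpy as np

def tamed_sgd_step(x, grad, alpha):
    """One tamed update: the raw step alpha * g is scaled down when the
    stochastic gradient g is large, mimicking the stability of an implicit
    scheme while remaining fully explicit."""
    g = grad(x)
    return x - alpha * g / (1.0 + alpha * np.linalg.norm(g))

# Toy strongly convex least-squares problem with mini-batch (noisy) gradients.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
b = rng.standard_normal(50)

def stochastic_grad(x, batch=10):
    idx = rng.integers(0, A.shape[0], size=batch)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch

x = np.zeros(5)
for k in range(1, 1001):
    alpha_k = 1.0 / k  # decreasing step sizes, as in sub-linear convergence analyses
    x = tamed_sgd_step(x, stochastic_grad, alpha_k)

print(x)
```

For small gradients the taming factor is close to 1 and the update behaves like plain SGD; for large gradients the effective step length is bounded by roughly 1, which is the mechanism behind the mild step-size restrictions mentioned in the abstract.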

Updated: 2021-06-18