Weak error analysis for stochastic gradient descent optimization algorithms
arXiv - CS - Numerical Analysis. Pub Date: 2020-07-03, DOI: arxiv-2007.02723. Aritz Bercher, Lukas Gonon, Arnulf Jentzen, Diyora Salimova
Stochastic gradient descent (SGD) type optimization schemes are fundamental
ingredients in a large number of machine learning based algorithms. In
particular, SGD type optimization schemes are frequently employed in
applications involving natural language processing, object and face
recognition, fraud detection, computational advertisement, and numerical
approximations of partial differential equations. Mathematical convergence
results for SGD type optimization schemes in the scientific literature usually
study two types of error criteria: the error in the strong sense and the error
with respect to the objective function. In applications one
is often not only interested in the size of the error with respect to the
objective function but also in the size of the error with respect to a test
function which is possibly different from the objective function. The analysis
of the size of this error is the subject of this article. In particular, the
main result of this article proves under suitable assumptions that the size of
this error decays at the same speed as in the special case where the test
function coincides with the objective function.
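The distinction between the error criteria can be illustrated with a minimal numerical sketch. The following is not code from the paper; the objective, test function, step sizes, and all names are illustrative assumptions. It runs SGD on a stochastic quadratic objective f(theta) = E[(theta - X)^2]/2 with X uniform on [-1, 1] (minimizer theta* = 0) and estimates the strong error, the error with respect to the objective function, and the weak error with respect to a test function phi that differs from f:

```python
import math
import random

# Illustrative sketch (assumed setup, not from the paper): SGD for the
# stochastic quadratic objective f(theta) = E[(theta - X)^2] / 2 with
# X uniform on [-1, 1], whose minimizer is theta* = E[X] = 0.  We
# estimate three error criteria for the final SGD iterate Theta_N:
#   strong error:          E[|Theta_N - theta*|]
#   objective-fn error:    E[f(Theta_N)] - f(theta*) = E[Theta_N^2] / 2
#   weak (test-fn) error:  |E[phi(Theta_N)] - phi(theta*)| for a test
#                          function phi chosen different from f.
random.seed(0)

def run_sgd(num_steps, theta0=1.0):
    theta = theta0
    for n in range(1, num_steps + 1):
        x = random.uniform(-1.0, 1.0)   # one sample of X
        grad = theta - x                # unbiased estimate of f'(theta)
        theta -= grad / n               # classical 1/n step sizes
    return theta

def phi(t):                             # test function, deliberately != f
    return math.cos(t)

M, N = 2000, 1000                       # Monte Carlo runs, SGD steps
finals = [run_sgd(N) for _ in range(M)]
theta_star = 0.0

strong_err = sum(abs(t - theta_star) for t in finals) / M
objective_err = sum(0.5 * t * t for t in finals) / M
weak_err = abs(sum(phi(t) for t in finals) / M - phi(theta_star))

print(strong_err, objective_err, weak_err)
```

In this toy setting all three errors are small after N steps, and the weak error for the smooth test function decays at a speed comparable to the error with respect to the objective function, in line with the flavor of the article's main result.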
Updated: 2020-07-22
随机梯度下降 (SGD) 类型的优化方案是大量基于机器学习的算法的基本要素。特别是,SGD 类型的优化方案经常用于涉及自然语言处理、对象和人脸识别、欺诈检测、计算广告和偏微分方程数值近似的应用中。在 SGD 类型优化方案的数学收敛结果中,科学文献中研究的误差准则通常有两种,即强意义误差和相对于目标函数的误差。在应用中,人们通常不仅对目标函数的误差大小感兴趣,而且对可能与目标函数不同的测试函数的误差大小感兴趣。分析这个误差的大小是本文的主题。特别是,本文的主要结果在适当的假设下证明了该误差的大小以与测试函数与目标函数重合的特殊情况相同的速度衰减。