General Convergence Analysis of Stochastic First-Order Methods for Composite Optimization
Journal of Optimization Theory and Applications (IF 1.9), Pub Date: 2021-02-22, DOI: 10.1007/s10957-021-01821-2
Ion Necoara

In this paper, we consider stochastic composite convex optimization problems in which the objective function satisfies a stochastic bounded gradient condition, with or without a quadratic functional growth property. These models cover the best-known classes of objective functions analyzed in the literature: nonsmooth Lipschitz functions and compositions of a (potentially) nonsmooth function with a smooth function, with or without strong convexity. Based on the flexibility offered by our optimization model, we consider several variants of stochastic first-order methods, such as the stochastic proximal gradient and the stochastic proximal point algorithms. The convergence theory for these methods has usually been derived for simple stochastic optimization models satisfying restrictive assumptions, with rates that are in general sublinear and hold only for specific decreasing stepsizes. Hence, we analyze the convergence rates of stochastic first-order methods with constant or variable stepsize under general assumptions covering a large class of objective functions. For a constant stepsize, we show that these methods can achieve a linear convergence rate up to a constant proportional to the stepsize and, under a strong stochastic bounded gradient condition, even pure linear convergence. Moreover, when a variable stepsize is chosen, we derive sublinear convergence rates for these stochastic first-order methods. Finally, the stochastic gradient mapping and the Moreau smoothing mapping introduced in the present paper lead to simple and intuitive proofs.
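As a rough illustration of the stochastic proximal gradient iteration for a composite model min_x E[f(x; xi)] + g(x) with constant stepsize, the following minimal sketch uses a least-squares loss f and an l1 regularizer g. The problem data, stepsize, and regularizer are illustrative assumptions only and are not taken from the paper.

import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def stochastic_prox_gradient(A, b, lam, stepsize=0.01, iters=1000, seed=0):
    """Stochastic proximal gradient with constant stepsize on
    min_x (1/2) E_i (a_i^T x - b_i)^2 + lam * ||x||_1  (illustrative model)."""
    rng = np.random.default_rng(seed)
    n_samples, dim = A.shape
    x = np.zeros(dim)
    for _ in range(iters):
        i = rng.integers(n_samples)                # sample one data point xi_k
        grad = (A[i] @ x - b[i]) * A[i]            # stochastic gradient of f(.; xi_k)
        x = soft_threshold(x - stepsize * grad, stepsize * lam)  # prox step on g
    return x

# Example usage on synthetic data (illustrative)
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = stochastic_prox_gradient(A, b, lam=0.1)

With a constant stepsize, iterates of this kind typically converge quickly to a neighborhood of the solution whose size is proportional to the stepsize, which is the behavior the abstract refers to as linear convergence up to a stepsize-dependent constant.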




Updated: 2021-02-22