Asynchronous variance-reduced block schemes for composite non-convex stochastic optimization: block-specific steplengths and adapted batch-sizes
Optimization Methods & Software (IF 1.4), Pub Date: 2020-04-13, DOI: 10.1080/10556788.2020.1746963
Jinlong Lei, Uday V. Shanbhag

This work considers the minimization of the sum of an expectation-valued, coordinate-wise smooth nonconvex function and a nonsmooth block-separable convex regularizer. We propose an asynchronous variance-reduced algorithm in which, at each iteration, a single randomly chosen block updates its estimate via a proximal variable sample-size stochastic gradient scheme while the remaining blocks are kept unchanged. Notably, each block employs a steplength based on its block-specific Lipschitz constant, and its batch size is updated as a function of the number of times that block has been selected. We show that every limit point is a stationary point and establish an ergodic non-asymptotic rate of O(1/K). The iteration and oracle complexities to obtain an ε-stationary point are shown to be O(1/ε) and O(1/ε²), respectively. Furthermore, under a proximal Polyak–Łojasiewicz condition with batch sizes increasing at a geometric rate, we prove that the suboptimality diminishes at a geometric rate, which is the optimal deterministic rate; the iteration and oracle complexities to obtain an ε-optimal solution are O(ln(1/ε)) and O((1/ε)^(1+c)) with c ≥ 0. In the single-block setting, we obtain the optimal oracle complexity O(1/ε). Finally, preliminary numerics suggest that the schemes compare well with competitors reliant on global Lipschitz constants.
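
The update pattern described in the abstract (pick one block at random, take a proximal stochastic gradient step with a block-specific steplength, grow that block's batch size with its selection count) can be illustrated with a minimal serial sketch. This is not the authors' algorithm or code. Everything concrete here is an assumption made for illustration: the toy least-squares loss with additive Gaussian gradient noise, the ℓ1 regularizer, the steplength 1/(2·L_i), the linear batch-size rule, and all function and parameter names. Asynchrony is not modeled, and the toy loss is convex, whereas the paper targets nonconvex losses.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (the assumed block-separable regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def block_prox_sgd(A, b, blocks, lam=0.1, noise=0.5, iters=2000, seed=0):
    """Randomized block proximal stochastic gradient sketch (hypothetical names).

    Smooth part: f(x) = 0.5 * ||A x - b||^2, observed through noisy gradients.
    Nonsmooth part: lam * ||x||_1, which separates across blocks.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    # Block-specific Lipschitz constant of the block gradient: ||A_i||_2^2.
    L = [np.linalg.norm(A[:, blk], 2) ** 2 for blk in blocks]
    counts = [0] * len(blocks)  # how often each block has been selected
    for _ in range(iters):
        i = rng.integers(len(blocks))      # choose a single block uniformly at random
        blk = blocks[i]
        counts[i] += 1
        batch = counts[i]                  # assumed rule: batch size = selection count
        exact = A[:, blk].T @ (A @ x - b)  # exact block gradient of the toy loss
        # Averaging `batch` noisy samples shrinks the gradient-noise variance by 1/batch.
        samples = noise * rng.standard_normal((batch, len(blk)))
        grad = exact + samples.mean(axis=0)
        alpha = 1.0 / (2.0 * L[i])         # assumed block-specific steplength
        x[blk] = soft_threshold(x[blk] - alpha * grad, alpha * lam)
    return x, counts

# Usage on a small synthetic problem with three blocks of two coordinates each.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 6))
b = A @ np.array([1.0, 0.0, -2.0, 0.0, 0.5, 0.0])
blocks = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
x_hat, counts = block_prox_sgd(A, b, blocks)
```

Using ||A_i||_2^2 per block rather than the global constant ||A||_2^2 allows larger steps on well-conditioned blocks, which is the motivation the abstract gives for block-specific steplengths.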




Updated: 2020-04-13