Stochastic Conditional Gradient++: (Non)Convex Minimization and Continuous Submodular Maximization
SIAM Journal on Optimization (IF 2.6), Pub Date: 2020-12-14, DOI: 10.1137/19m1304271
Hamed Hassani, Amin Karbasi, Aryan Mokhtari, Zebang Shen

SIAM Journal on Optimization, Volume 30, Issue 4, Page 3315-3344, January 2020.
In this paper, we consider general nonoblivious stochastic optimization, where the underlying stochasticity may change during the optimization procedure and depends on the point at which the function is evaluated. We develop Stochastic Frank--Wolfe++ (SFW++), an efficient variant of the conditional gradient method for minimizing a smooth nonconvex function subject to a convex body constraint. We show that SFW++ converges to an $\epsilon$-first-order stationary point using $O(1/\epsilon^3)$ stochastic gradients. When additional structure is present, SFW++'s theoretical guarantees improve in terms of both convergence rate and solution quality. In particular, for minimizing a convex function, SFW++ achieves an $\epsilon$-approximate optimum using $O(1/\epsilon^2)$ stochastic gradients, a rate known to be optimal in terms of stochastic gradient evaluations. Similarly, for maximizing a monotone continuous DR-submodular function, a slightly different form of SFW++, called Stochastic Continuous Greedy++ (SCG++), achieves a tight $[(1-1/e){\text{OPT}} - \epsilon]$ solution using $O(1/\epsilon^2)$ stochastic gradients. Through an information-theoretic argument, we also prove that SCG++'s convergence rate is optimal. Finally, for maximizing a nonmonotone continuous DR-submodular function, we can achieve a $[(1/e){\text{OPT}} - \epsilon]$ solution using $O(1/\epsilon^2)$ stochastic gradients. We highlight that our results and our novel variance reduction technique extend directly to the standard, and easier, oblivious stochastic optimization setting for (non)convex and continuous submodular objectives.
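To make the conditional gradient template concrete, the following is a minimal, illustrative Python sketch of a generic stochastic Frank-Wolfe iteration over the probability simplex. It is not the paper's SFW++ method, which adds a specific variance-reduction scheme to reach the stated $O(1/\epsilon^3)$ complexity; the grad_oracle interface and the toy quadratic objective are hypothetical and introduced only for illustration.

```python
import numpy as np

def stochastic_frank_wolfe(grad_oracle, dim, num_iters=1000, batch_size=64):
    """Generic stochastic Frank-Wolfe over the probability simplex.

    Illustrative template of the conditional gradient method, NOT the paper's
    SFW++ variance-reduced variant. grad_oracle(x, batch_size) is assumed
    (hypothetical interface) to return an unbiased gradient estimate at x.
    """
    x = np.full(dim, 1.0 / dim)          # feasible start: uniform point on the simplex
    for t in range(1, num_iters + 1):
        g = grad_oracle(x, batch_size)   # stochastic gradient estimate
        # Linear minimization oracle over the simplex: a vertex (basis vector)
        v = np.zeros(dim)
        v[np.argmin(g)] = 1.0
        gamma = 2.0 / (t + 2)            # standard diminishing step size
        x = (1 - gamma) * x + gamma * v  # convex combination stays feasible
    return x

# Toy usage: minimize E[0.5 * ||x - a||^2] over the simplex with noisy gradients.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = np.array([0.6, 0.3, 0.1])

    def grad_oracle(x, batch_size):
        noise = rng.normal(scale=0.1, size=(batch_size, a.size)).mean(axis=0)
        return (x - a) + noise

    print(stochastic_frank_wolfe(grad_oracle, dim=3))
```

The key design point of conditional gradient methods, preserved in this sketch, is that each update solves a linear minimization oracle over the constraint set instead of projecting, so iterates remain feasible by convex combination.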


Updated: 2020-12-14