Byzantine-Resilient Non-Convex Stochastic Gradient Descent
arXiv - CS - Neural and Evolutionary Computing | Pub Date: 2020-12-28 | DOI: arxiv-2012.14368
Zeyuan Allen-Zhu, Faeze Ebrahimian, Jerry Li, Dan Alistarh

We study adversary-resilient stochastic distributed optimization, in which $m$ machines can independently compute stochastic gradients, and cooperate to jointly optimize over their local objective functions. However, an $\alpha$-fraction of the machines are $\textit{Byzantine}$, in that they may behave in arbitrary, adversarial ways. We consider a variant of this procedure in the challenging $\textit{non-convex}$ case. Our main result is a new algorithm SafeguardSGD which can provably escape saddle points and find approximate local minima of the non-convex objective. The algorithm is based on a new concentration filtering technique, and its sample and time complexity bounds match the best known theoretical bounds in the stochastic, distributed setting when no Byzantine machines are present. Our algorithm is practical: it improves upon the performance of prior methods when training deep neural networks, it is relatively lightweight, and is the first method to withstand two recently-proposed Byzantine attacks.
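Since the abstract only describes the setting at a high level, the following is a minimal, self-contained Python/NumPy sketch of Byzantine-resilient distributed SGD with a simple score-based filter. The filtering rule shown (distance of each worker's accumulated gradient reports from the coordinate-wise median, with a sqrt(t)-scaled threshold) is an illustrative stand-in for the paper's concentration filtering technique; every function name, constant, and threshold below is an assumption for demonstration, not the authors' SafeguardSGD.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_grad(x):
    # Smooth non-convex toy objective: f(x) = sum(cos(x_i)) + 0.05 * ||x||^2.
    return -np.sin(x) + 0.1 * x

def honest_grad(x):
    # Honest worker: unbiased stochastic gradient with small noise.
    return true_grad(x) + 0.1 * rng.standard_normal(x.shape)

def byzantine_grad(x):
    # Adversarial worker: reports an arbitrary large vector.
    return 10.0 * rng.standard_normal(x.shape)

def filtered_sgd(dim=10, m=20, alpha=0.2, steps=500, lr=0.05):
    """Distributed SGD in which workers whose accumulated gradient reports
    drift too far from the coordinate-wise median are dropped."""
    n_byz = int(alpha * m)          # first n_byz workers act adversarially
    x = rng.standard_normal(dim)
    sums = np.zeros((m, dim))       # running sum of each worker's reports
    good = np.ones(m, dtype=bool)   # workers currently considered honest
    for t in range(1, steps + 1):
        grads = np.array([byzantine_grad(x) if i < n_byz else honest_grad(x)
                          for i in range(m)])
        sums[good] += grads[good]
        center = np.median(sums[good], axis=0)
        dist = np.linalg.norm(sums - center, axis=1)
        # Honest sums concentrate around the sum of true gradients, so their
        # distance from the median grows slowly; a sqrt(t) threshold (the
        # constant 5.0 is an arbitrary illustrative choice) separates them
        # from workers injecting large arbitrary vectors.
        good &= dist <= 5.0 * np.sqrt(t)
        if not good.any():
            break
        x -= lr * grads[good].mean(axis=0)
    return x, good

if __name__ == "__main__":
    x, good = filtered_sgd()
    print(f"workers kept: {good.sum()} / 20")
    print(f"final true-gradient norm: {np.linalg.norm(true_grad(x)):.4f}")
```

With these toy parameters the adversarial reports are so far from the honest trajectory that they are filtered within the first few steps, after which the averaged update behaves like plain SGD on the honest workers; choosing the threshold so that honest workers survive with high probability is exactly where the paper's concentration analysis would enter.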

Updated: 2020-12-29