Befriending The Byzantines Through Reputation Scores
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-06-24. DOI: arxiv-2006.13421. Authors: Jayanth Regatti and Abhishek Gupta
We propose two novel stochastic gradient descent algorithms, ByGARS and
ByGARS++, for distributed machine learning in the presence of Byzantine
adversaries. In these algorithms, reputation scores of workers are computed
using an auxiliary dataset with a larger stepsize. These reputation scores are
then used for aggregating the gradients for stochastic gradient descent with a
smaller stepsize. We show that using these reputation scores for gradient
aggregation is robust to any number of Byzantine adversaries. In contrast to
prior works targeting any number of adversaries, we improve the generalization
performance by making use of some adversarial workers along with the benign
ones. The computational complexity of ByGARS++ is the same as the usual
stochastic gradient descent method with only an additional inner product
computation. We establish its convergence for strongly convex loss functions
and demonstrate the effectiveness of the algorithms for non-convex learning
problems using MNIST and CIFAR-10 datasets.
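
The reputation-weighted aggregation described above can be illustrated with a small sketch. It is not the authors' ByGARS or ByGARS++ update, which the abstract does not spell out. It assumes a toy strongly convex loss 0.5 * ||w - target||^2, sign-flipping Byzantine workers, and a hypothetical reputation rule that accumulates the inner product between each worker's gradient and a gradient computed on the auxiliary data. The names run, alpha (the larger reputation stepsize), and beta (the smaller descent stepsize) are chosen here for illustration only.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def noisy_grad(w, target, rng, sigma=0.01):
    """Noisy gradient of the toy loss 0.5 * ||w - target||^2."""
    return [wi - ti + rng.gauss(0.0, sigma) for wi, ti in zip(w, target)]


def run(steps=50, n_benign=2, n_byzantine=3, alpha=0.5, beta=0.05, seed=0):
    """Toy reputation-weighted SGD with a Byzantine majority.

    Assumption (not taken from the paper): each reputation score is
    accumulated as q_i += alpha * <g_i, g_aux>.
    """
    rng = random.Random(seed)
    target = [1.0, -2.0]                 # minimizer of the toy loss
    w = [0.0, 0.0]                       # model parameters
    q = [0.0] * (n_benign + n_byzantine)  # one reputation score per worker

    for _ in range(steps):
        # Benign workers report honest noisy gradients.
        grads = [noisy_grad(w, target, rng) for _ in range(n_benign)]
        # Byzantine workers (assumed attack) report sign-flipped gradients.
        grads += [[-x for x in noisy_grad(w, target, rng)]
                  for _ in range(n_byzantine)]

        # Gradient on the trusted auxiliary data at the current parameters.
        g_aux = noisy_grad(w, target, rng)

        # Reputation update with the larger stepsize alpha:
        # one extra inner product per worker.
        q = [qi + alpha * dot(gi, g_aux) for qi, gi in zip(q, grads)]

        # Aggregate gradients weighted by reputation, then take a
        # descent step with the smaller stepsize beta.
        agg = [sum(qi * gi[k] for qi, gi in zip(q, grads))
               for k in range(len(w))]
        w = [wi - beta * ak for wi, ak in zip(w, agg)]

    return w, q


if __name__ == "__main__":
    w, q = run()
    print("final parameters:", w)
    print("reputation scores:", q)
```

In this toy setting the sign-flipping workers end up with negative scores, so their reports are effectively flipped back and still contribute to the descent direction, which is one way to read the abstract's claim about making use of some adversarial workers.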
Updated: 2020-06-25