Basil: A Fast and Byzantine-Resilient Approach for Decentralized Training
IEEE Journal on Selected Areas in Communications (IF 16.4), Pub Date: 2022-07-15, DOI: 10.1109/jsac.2022.3191347
Ahmed Roushdy Elkordy, Saurav Prakash, Salman Avestimehr

Decentralized (i.e., serverless) training across edge nodes can suffer substantially from Byzantine nodes that degrade training performance. However, detecting and mitigating Byzantine behavior in a decentralized learning setting is a daunting task, especially when the data distribution across users is heterogeneous. As our main contribution, we propose Basil, a fast and computationally efficient Byzantine-robust algorithm for decentralized training systems, which leverages a novel sequential, memory-assisted, and performance-based criterion for training over a logical ring while filtering out the Byzantine users. In the IID dataset setting, we provide theoretical convergence guarantees for Basil, demonstrating its linear convergence rate. Furthermore, for the IID setting, we experimentally demonstrate that Basil is robust to various Byzantine attacks, including the strong Hidden attack, while providing up to ~16% higher absolute test accuracy than the state-of-the-art Byzantine-resilient decentralized learning approach. Additionally, we generalize Basil to the non-IID setting by proposing Anonymous Cyclic Data Sharing (ACDS), a technique that allows each node to anonymously share a random fraction of its local non-sensitive dataset (e.g., landmark images) with all other nodes. Finally, to reduce the overall latency of Basil resulting from its sequential implementation over the logical ring, we propose Basil+, which enables Byzantine-robust parallel training across groups of logical rings while retaining the performance gains of Basil due to sequential training within each group. We also experimentally demonstrate the scalability gains of Basil+ through different sets of experiments.
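To make the "sequential, memory-assisted, and performance-based" idea concrete, the following is a toy sketch only, not the authors' algorithm or code. It assumes a 1-D linear model, a squared-error loss, a Byzantine node that sends arbitrary large values, and a memory of two predecessor models; the function names (`loss`, `grad`, `ring_train`) and all parameters are hypothetical and chosen for illustration.

```python
import random

def loss(w, batch):
    """Mean squared error of the 1-D linear model y = w * x on a batch."""
    return sum((w * x - y) ** 2 for x, y in batch) / len(batch)

def grad(w, batch):
    """Gradient of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def ring_train(n_nodes=8, byzantine=frozenset({3}), memory=2,
               rounds=20, lr=0.5, seed=0):
    """Sequential training over a logical ring (illustrative assumptions only)."""
    rng = random.Random(seed)
    # IID local data: every node samples from the same ground truth y = 2x.
    data = [[(x, 2.0 * x) for x in (rng.uniform(-1, 1) for _ in range(16))]
            for _ in range(n_nodes)]
    # sent[i] holds the most recent model that node i passed along the ring.
    sent = [0.0] * n_nodes
    for _ in range(rounds):
        for i in range(n_nodes):
            if i in byzantine:
                # Assumed attack: the faulty node sends an arbitrary model.
                sent[i] = rng.uniform(50, 100)
                continue
            # Memory: the models stored from the `memory` ring predecessors.
            candidates = [sent[(i - k) % n_nodes] for k in range(1, memory + 1)]
            # Performance-based choice: lowest loss on this node's own data.
            best = min(candidates, key=lambda w: loss(w, data[i]))
            # One local gradient step on the selected model, then pass it on.
            sent[i] = best - lr * grad(best, data[i])
    return [sent[i] for i in range(n_nodes) if i not in byzantine]

if __name__ == "__main__":
    print(ring_train())
```

In this sketch, the node that follows the Byzantine node still holds an honest model from two hops back, and that model scores better on its local data, so the faulty model is filtered out. The memory size needed in general depends on how many Byzantine nodes are present and where they sit on the ring, which this toy example does not address.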

Updated: 2022-07-15