On Large-Cohort Training for Federated Learning
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2021-06-15, DOI: arxiv-2106.07820
Zachary Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith

Federated learning methods typically learn a model by iteratively sampling updates from a population of clients. In this work, we explore how the number of clients sampled at each round (the cohort size) impacts the quality of the learned model and the training dynamics of federated learning algorithms. Our work poses three fundamental questions. First, what challenges arise when trying to scale federated learning to larger cohorts? Second, what parallels exist between cohort sizes in federated learning and batch sizes in centralized learning? Last, how can we design federated learning methods that effectively utilize larger cohort sizes? We give partial answers to these questions based on extensive empirical evaluation. Our work highlights a number of challenges stemming from the use of larger cohorts. While some of these (such as generalization issues and diminishing returns) are analogs of large-batch training challenges, others (including training failures and fairness concerns) are unique to federated learning.
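To make the round structure described above concrete, below is a minimal Python sketch of a FedAvg-style training loop in which a cohort_size parameter controls how many clients are sampled per round. This is not the authors' code: the scalar toy model and the names client_update and federated_averaging are illustrative assumptions.

import random

random.seed(0)  # reproducible toy run

def client_update(global_w, client_data, lr=0.1, local_steps=5):
    # A few steps of local SGD on squared error for a single scalar
    # weight; real deployments train full models on-device.
    w = global_w
    for _ in range(local_steps):
        x = random.choice(client_data)
        w -= lr * 2 * (w - x)        # gradient of (w - x)^2
    return w - global_w              # send back the weight delta

def federated_averaging(clients, cohort_size, num_rounds=50):
    # Each round: sample a cohort of `cohort_size` clients, collect
    # their local updates, and apply the unweighted average.
    w = 0.0
    for _ in range(num_rounds):
        cohort = random.sample(clients, cohort_size)
        deltas = [client_update(w, data) for data in cohort]
        w += sum(deltas) / len(deltas)
    return w

# Toy population: ten clients whose data are centered at different means,
# mimicking the heterogeneity that makes cohort size matter.
clients = [[random.gauss(mu, 1.0) for _ in range(20)] for mu in range(10)]
for cohort_size in (1, 5, 10):
    print(f"cohort_size={cohort_size}: w = {federated_averaging(clients, cohort_size):.3f}")

In this sketch, a larger cohort_size averages more client deltas per round and so reduces update variance, but, as the abstract notes, larger cohorts bring diminishing returns and failure modes specific to the federated setting.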

Updated: 2021-06-16