Towards Efficient Scheduling of Federated Mobile Devices under Computational and Statistical Heterogeneity
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2021-02-01, DOI: 10.1109/tpds.2020.3023905
Cong Wang, Yuanyuan Yang, Pengzhan Zhou

Originating from distributed learning, federated learning enables privacy-preserving collaboration at a new abstraction level by sharing only the model parameters. While current research mainly focuses on optimizing learning algorithms and minimizing the communication overhead inherited from distributed learning, there is still a considerable gap when it comes to real implementations on mobile devices. In this article, we start with an empirical experiment to demonstrate that computational heterogeneity is a more pronounced bottleneck than communication on the current generation of battery-powered mobile devices, and that existing methods are haunted by mobile stragglers. Further, non-identically distributed data across mobile users makes participant selection critical to accuracy and convergence. To tackle the computational and statistical heterogeneity, we use data as a tuning knob and propose two efficient polynomial-time algorithms to schedule different workloads on various mobile devices, for identically and non-identically distributed data. For identically distributed data, we combine partitioning with linear bottleneck assignment to achieve near-optimal training time without accuracy loss. For non-identically distributed data, we cast scheduling as an average cost minimization problem and propose a greedy algorithm to strike a reasonable balance between computation time and accuracy. We also build an offline profiler to quantify the runtime behavior of different devices, which serves as input to the scheduling algorithms. We conduct extensive experiments on a mobile testbed with two datasets and up to 20 devices. Compared with common benchmarks, the proposed algorithms achieve a 2–100× per-epoch speedup, a 2–7 percent accuracy gain, and more than a 100 percent improvement in convergence rate on CIFAR10.
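
The abstract names linear bottleneck assignment as the core of the scheduler for identically distributed data but includes no code. The sketch below is a minimal, generic illustration of that primitive under stated assumptions, not the paper's implementation: it assumes the offline profiler has already produced a square matrix cost[i, j] of training times (device i on data partition j), and the function and variable names are ours.

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import maximum_bipartite_matching

    def bottleneck_assignment(cost):
        """Linear bottleneck assignment: match each device to one data
        partition so that the slowest device-partition pair (the
        bottleneck) is as fast as possible. cost[i, j] is the profiled
        training time of device i on partition j (an n-by-n matrix)."""
        thresholds = np.unique(cost)          # candidate bottleneck values, sorted
        lo, hi, best = 0, len(thresholds) - 1, None
        while lo <= hi:                       # binary-search the optimal bottleneck
            mid = (lo + hi) // 2
            # keep only edges whose cost fits under the trial bottleneck
            feasible = csr_matrix((cost <= thresholds[mid]).astype(np.uint8))
            match = maximum_bipartite_matching(feasible, perm_type='column')
            if (match >= 0).all():            # a perfect matching exists
                best, hi = match, mid - 1     # record it, try a smaller bottleneck
            else:
                lo = mid + 1
        return best                           # best[i] = partition assigned to device i

    # Toy example: 3 devices, 3 partitions, profiled times in seconds.
    times = np.array([[4.0, 9.0, 7.0],
                      [8.0, 3.0, 6.0],
                      [5.0, 8.0, 2.0]])
    print(bottleneck_assignment(times))       # [0 1 2] -> bottleneck of 4.0 s

Binary search over the distinct cost values, with feasibility checked by maximum bipartite matching, is the textbook polynomial-time solution to the linear bottleneck assignment problem, consistent with the polynomial-time guarantee stated in the abstract.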
