Accelerating DNN Training in Wireless Federated Edge Learning Systems
IEEE Journal on Selected Areas in Communications (IF 13.8), Pub Date: 2021-01-01, DOI: 10.1109/jsac.2020.3036971
Jinke Ren, Guanding Yu, Guangyao Ding

The training task in classical machine learning models, such as deep neural networks, is generally implemented at a remote cloud center for centralized learning, which is typically time-consuming and resource-hungry. It also incurs serious privacy issues and long communication latency, since a large amount of data is transmitted to the centralized node. To overcome these shortcomings, we consider a newly emerged framework, namely federated edge learning, which aggregates local learning updates at the network edge in lieu of users’ raw data. Aiming at accelerating the training process, we first define a novel performance evaluation criterion, called learning efficiency. We then formulate a training acceleration optimization problem in the CPU scenario, where each user device is equipped with a CPU. Closed-form expressions for joint batch size selection and communication resource allocation are derived, and some insightful results are highlighted. Further, we extend our learning framework to the GPU scenario. The optimal solution in this scenario is shown to have a similar structure to that of the CPU scenario, suggesting that the proposed algorithm is applicable to more general systems. Finally, extensive experiments validate the theoretical analysis and demonstrate that the proposed algorithm can simultaneously reduce the training time and improve the learning accuracy.
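To make the aggregation idea concrete, the following Python sketch shows one generic FedAvg-style communication round, in which an edge server combines locally computed model updates rather than collecting raw data. The linear model, the local_update and edge_aggregate helpers, and the per-device batch sizes are illustrative assumptions, not the paper's algorithm or its closed-form batch-size and resource-allocation rules.

```python
# Minimal sketch (not the authors' method): one round of federated edge
# learning, where an edge server aggregates local model updates instead of
# the users' raw data. Batch sizes per device are hypothetical inputs that
# a scheduler like the one studied in the paper could supply.
import numpy as np

def local_update(weights, data, labels, batch_size, lr=0.01):
    """One local SGD step on a linear model (illustrative stand-in for a DNN)."""
    idx = np.random.choice(len(data), size=batch_size, replace=False)
    x, y = data[idx], labels[idx]
    grad = x.T @ (x @ weights - y) / batch_size   # squared-loss gradient
    return weights - lr * grad

def edge_aggregate(local_weights, sample_counts):
    """Weighted (FedAvg-style) aggregation of local updates at the edge server."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

# Toy run: three devices with different (hand-picked) batch sizes.
rng = np.random.default_rng(0)
global_w = np.zeros(5)
devices = [(rng.normal(size=(100, 5)), rng.normal(size=100)) for _ in range(3)]
batch_sizes = [16, 32, 64]   # in the paper these would come from the joint optimization

for _ in range(10):                               # communication rounds
    locals_ = [local_update(global_w, x, y, b)
               for (x, y), b in zip(devices, batch_sizes)]
    global_w = edge_aggregate(locals_, [len(x) for x, _ in devices])
```

In the paper's setting, the batch sizes and the communication resources assigned to each device would be chosen jointly to maximize learning efficiency; here they are fixed by hand only to keep the example self-contained.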

Updated: 2021-01-01