Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing
IEEE Transactions on Wireless Communications (IF 8.9) Pub Date: 2020-01-01, DOI: 10.1109/twc.2019.2946140
En Li, Liekang Zeng, Zhi Zhou, Xu Chen

As a key technology for enabling Artificial Intelligence (AI) applications in the 5G era, Deep Neural Networks (DNNs) have quickly attracted widespread attention. However, it is challenging to run computation-intensive DNN-based tasks on mobile devices due to their limited computation resources. Worse still, traditional cloud-assisted DNN inference is heavily hindered by significant wide-area network latency, leading to poor real-time performance and a low quality of user experience. To address these challenges, in this paper we propose Edgent, a framework that leverages edge computing for DNN collaborative inference through device-edge synergy. Edgent exploits two design knobs: (1) DNN partitioning, which adaptively partitions computation between device and edge for the purpose of coordinating the powerful cloud resource and the proximal edge resource for real-time DNN inference; and (2) DNN right-sizing, which further reduces computing latency by exiting inference early at an appropriate intermediate DNN layer. In addition, considering the potential network fluctuation in real-world deployment, Edgent is designed to specialize for both static and dynamic network environments. Specifically, in a static environment where the bandwidth changes slowly, Edgent derives the best configurations with the assistance of regression-based prediction models, while in a dynamic environment where the bandwidth varies dramatically, Edgent generates the best execution plan through an online change point detection algorithm that maps the current bandwidth state to the optimal configuration. We implement the Edgent prototype on a Raspberry Pi and a desktop PC, and extensive experimental evaluations demonstrate Edgent's effectiveness in enabling on-demand low-latency edge intelligence.
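To make the two design knobs concrete, the sketch below shows one way the joint decision described in the abstract could be computed: given regression-predicted per-layer latencies on the device and on the edge server plus the current bandwidth, it searches early-exit branches (most accurate first) and partition points for the most accurate plan that still meets a latency budget. This is a minimal sketch based only on the abstract; all function and variable names are illustrative assumptions, not Edgent's actual code.

```python
# A minimal sketch of the two design knobs: DNN partitioning plus early-exit
# right-sizing, driven by regression-predicted per-layer latencies.
# Names and data layout are assumptions made for illustration.

def end_to_end_latency(device_lat, edge_lat, out_bytes, input_bytes,
                       partition, bandwidth_Bps):
    """Latency of running layers [0, partition) on-device and the rest on the edge.

    device_lat / edge_lat: predicted per-layer latencies in seconds.
    out_bytes: output size of each layer; input_bytes: size of the raw input.
    """
    n = len(device_lat)
    device_part = sum(device_lat[:partition])
    edge_part = sum(edge_lat[partition:])
    if partition == n:        # everything runs on-device: nothing to upload
        transfer = 0.0
    elif partition == 0:      # everything runs on the edge: upload the raw input
        transfer = input_bytes / bandwidth_Bps
    else:                     # upload the feature map at the partition point
        transfer = out_bytes[partition - 1] / bandwidth_Bps
    return device_part + transfer + edge_part


def best_plan(branches, input_bytes, bandwidth_Bps, latency_budget):
    """Pick (exit branch, partition point) maximizing accuracy within the budget.

    branches: [(accuracy, device_lat, edge_lat, out_bytes), ...] sorted by
    descending accuracy, i.e., the deepest early-exit branch comes first.
    """
    for exit_idx, (acc, dev, edg, out) in enumerate(branches):
        for partition in range(len(dev) + 1):
            lat = end_to_end_latency(dev, edg, out, input_bytes,
                                     partition, bandwidth_Bps)
            if lat <= latency_budget:
                return acc, exit_idx, partition  # first feasible = most accurate
    return None  # no configuration meets the latency requirement
```

In the static-environment case the abstract describes, such a search would run once per latency requirement, with the regression-based prediction models supplying the per-layer latency estimates.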
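For the dynamic case, the abstract only states that an online change point detection algorithm maps the current bandwidth state to the optimal configuration; the paper's actual detector is not given here. The following is a simple sliding-window stand-in, assumed purely for illustration, that flags a state change when the recent mean bandwidth drifts, at which point a plan would be looked up or recomputed (e.g., via best_plan above).

```python
from collections import deque

class BandwidthStateTracker:
    """Toy online change-point detector over bandwidth samples (an assumption;
    the paper's algorithm may differ). A change is declared when the mean of a
    recent window drifts from the current state's mean by a relative threshold."""

    def __init__(self, window=20, rel_threshold=0.3):
        self.window = deque(maxlen=window)
        self.rel_threshold = rel_threshold
        self.state_mean = None  # mean bandwidth of the current state

    def observe(self, bw_mbps):
        """Feed one measurement; return (changed, current state mean)."""
        self.window.append(bw_mbps)
        recent_mean = sum(self.window) / len(self.window)
        if self.state_mean is None:
            self.state_mean = recent_mean
            return False, self.state_mean
        if abs(recent_mean - self.state_mean) > self.rel_threshold * self.state_mean:
            self.window.clear()           # start a fresh window for the new state
            self.window.append(bw_mbps)
            self.state_mean = bw_mbps
            return True, self.state_mean
        return False, self.state_mean
```

On each detected change, the new state mean would be mapped to a precomputed (exit point, partition point) configuration, matching the abstract's description of reacting to dramatic bandwidth variation.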

Updated: 2020-01-01