Accelerating on-device DNN inference during service outage through scheduling early exit
Computer Communications (IF 4.5) | Pub Date: 2020-08-13 | DOI: 10.1016/j.comcom.2020.08.005
Zizhao Wang , Wei Bao , Dong Yuan , Liming Ge , Nguyen H. Tran , Albert Y. Zomaya

In recent years, the rapid development of edge computing has enabled us to process a wide variety of intelligent applications at the edge, such as real-time video analytics. However, edge computing can suffer from service outages caused by fluctuating wireless connections or congested computing resources. During a service outage, the only choice is to process deep neural network (DNN) inference on the local mobile device. The obstacle is that, due to limited resources, it may not be possible to complete inference tasks on time. Inspired by the recently developed early exit of DNNs, where we can exit a DNN at an earlier layer to shorten the inference delay at the cost of an acceptable loss of accuracy, we propose to adopt this mechanism to process inference tasks during a service outage. The challenge is how to obtain the optimal schedule given the diverse early-exit choices. To this end, we formulate an optimal scheduling problem whose objective is to maximize a general overall utility. However, the problem takes the form of an integer program, which cannot be solved by standard approaches. We therefore prove the Ordered Scheduling structure, which states that a frame that arrives earlier must be scheduled earlier. This structure greatly reduces the search space for an optimal solution. We then propose the Scheduling Early Exit (SEE) algorithm, based on dynamic programming, which solves the problem optimally with polynomial computational complexity. Finally, we conduct trace-driven simulations and a real-world experiment to compare SEE with two benchmarks. The results show that SEE outperforms the benchmarks in utility gain by 50.9% in the simulation and by 57.79% in the real-world experiment.
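To make the Ordered Scheduling idea concrete, the following is a minimal sketch, not the authors' SEE algorithm: assume each frame arrives at an integer time slot, shares a common relative deadline, and can either be dropped or run to one of several exit points, where exit j takes `exit_cost[j]` slots and yields utility `exit_util[j]`. Because frames are processed in arrival order, a dynamic program over (next frame, time the device becomes free) suffices; all names and parameters here are hypothetical illustrations.

```python
from functools import lru_cache

def schedule_early_exits(arrivals, rel_deadline, exit_cost, exit_util):
    """Toy DP for ordered early-exit scheduling (illustrative only).

    arrivals     : tuple of integer arrival slots, non-decreasing
    rel_deadline : each frame i must finish by arrivals[i] + rel_deadline
    exit_cost    : processing slots needed to reach each exit point
    exit_util    : utility earned for each exit point

    Returns (max total utility, per-frame plan), where the plan entry is
    an exit index or None if the frame is dropped.
    """
    n = len(arrivals)

    @lru_cache(maxsize=None)
    def best(i, free_at):
        if i == n:
            return 0, ()
        # Option 1: drop frame i (zero utility for it).
        best_u, rest = best(i + 1, free_at)
        best_plan = (None,) + rest
        # Option 2: run frame i to some exit j, if it meets the deadline.
        # Ordered Scheduling: frame i starts only after earlier frames finish.
        start = max(free_at, arrivals[i])
        for j, (c, u) in enumerate(zip(exit_cost, exit_util)):
            finish = start + c
            if finish <= arrivals[i] + rel_deadline:
                rest_u, rest = best(i + 1, finish)
                if u + rest_u > best_u:
                    best_u, best_plan = u + rest_u, (j,) + rest
        return best_u, best_plan

    return best(0, 0)
```

For example, with two frames arriving at slots 0 and 1, a relative deadline of 4 slots, and two exits costing 1 and 3 slots with utilities 1 and 3, the sketch chooses the shallow exit for the first frame so the second frame can reach the deep exit, for a total utility of 4. The actual SEE algorithm in the paper operates on a more general utility model, but the same ordered-frames structure is what collapses the integer program into a polynomial-size DP table.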




Updated: 2020-08-28