ScissionLite: Accelerating Distributed Deep Neural Networks Using Transfer Layer
arXiv - CS - Artificial Intelligence. Pub Date: 2021-05-05, DOI: arxiv-2105.02019
Hyunho Ahn, Munkyu Lee, Cheol-Ho Hong, Blesson Varghese

Industrial Internet of Things (IIoT) applications can benefit from leveraging edge computing. For example, applications underpinned by deep neural network (DNN) models can be sliced and distributed across an IIoT device and the edge of the network to improve overall inference performance and to enhance the privacy of input data, such as industrial product images. However, low network performance between IIoT devices and the edge is often a bottleneck. In this study, we develop ScissionLite, a holistic framework for accelerating distributed DNN inference using a Transfer Layer (TL). The TL is a traffic-aware layer inserted at the optimal slicing point of a DNN model in order to decrease outbound network traffic without a significant accuracy drop. For the TL, we implement a new lightweight down/upsampling network for performance-limited IIoT devices. In ScissionLite, we develop ScissionTL, the Preprocessor, and the Offloader, which together handle the end-to-end deployment of DNN slices with the TL. They decide the optimal slicing point of the DNN, prepare pre-trained DNN slices including the TL, and execute the DNN slices on the IIoT device and the edge. Employing the TL in the sliced DNN models incurs negligible overhead. ScissionLite improves inference latency by up to 16x and 2.8x when compared to execution on the local device and an existing state-of-the-art model slicing approach, respectively.
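The core idea can be illustrated with a minimal sketch. Note that the paper's TL is a learned lightweight down/upsampling network; the sketch below substitutes a fixed average-pool on the device side and nearest-neighbor upsampling on the edge side (a hypothetical stand-in, not the authors' implementation) purely to show how shrinking the intermediate activation at the slicing point reduces outbound traffic:

```python
import numpy as np

def tl_downsample(act, s=2):
    """Device-side half of a Transfer Layer: average-pool H x W by factor s
    before sending the activation over the network (fixed pooling here;
    the paper uses a learned lightweight network)."""
    n, c, h, w = act.shape
    return act.reshape(n, c, h // s, s, w // s, s).mean(axis=(3, 5))

def tl_upsample(act, s=2):
    """Edge-side half: restore the spatial resolution with nearest-neighbor
    upsampling so the remaining DNN slice sees the expected shape."""
    return act.repeat(s, axis=2).repeat(s, axis=3)

# Intermediate activation at a hypothetical slicing point of a CNN.
act = np.random.rand(1, 64, 56, 56).astype(np.float32)

sent = tl_downsample(act)        # what actually crosses the network
restored = tl_upsample(sent)     # input to the edge-side DNN slice

reduction = act.nbytes / sent.nbytes
print(sent.shape, restored.shape, reduction)  # (1, 64, 28, 28) (1, 64, 56, 56) 4.0
```

With a factor-2 spatial downsample, outbound traffic shrinks 4x; the learned TL in the paper is trained so that this compression costs little accuracy.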

Updated: 2021-05-06