Modeling of Deep Neural Network (DNN) Placement and Inference in Edge Computing
arXiv - CS - Networking and Internet Architecture Pub Date : 2020-01-19 , DOI: arxiv-2001.06901
Mounir Bensalem, Jasenka Dizdarević and Admela Jukan

With edge computing becoming an increasingly adopted concept in system architectures, its utilization is expected to be further heightened when combined with deep learning (DL) techniques. The idea of integrating demanding processing algorithms, such as Deep Neural Networks (DNNs), into Internet of Things (IoT) and edge devices has in large measure benefited from the development of edge computing hardware, as well as from adapting the algorithms for use in resource-constrained IoT devices. Surprisingly, there are as yet no models for optimally placing and using machine learning in edge computing. In this paper, we propose the first model for optimal DNN placement and inference in edge computing. We present a mathematical formulation of the DNN Model Variant Selection and Placement (MVSP) problem, considering the inference latency of different model variants, the communication latency between nodes, and the utilization cost of edge computing nodes. We evaluate our model numerically and show that increasing model co-location decreases the average per-request latency (on a millisecond scale) by 33% for low load and by 21% for high load.
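The trade-off described above can be illustrated with a toy sketch: for each request, choose a (model variant, edge node) pair minimizing the sum of inference latency, communication latency, and a weighted utilization cost. All names and numbers below are illustrative assumptions, not the paper's actual MVSP formulation, which is a full mathematical optimization over all requests and nodes.

```python
# Toy illustration of the MVSP trade-off: pick the (model variant,
# edge node) pair with minimum total cost for a single request.
# All latencies/costs below are made-up example values.
from itertools import product

# inference latency (ms) of each model variant on each node
infer_ms = {("small", "edge1"): 5, ("small", "edge2"): 6,
            ("large", "edge1"): 20, ("large", "edge2"): 25}
# communication latency (ms) from the request source to each node
comm_ms = {"edge1": 2, "edge2": 1}
# utilization cost of running a request on each node (arbitrary units)
util_cost = {"edge1": 1.0, "edge2": 3.0}

def best_placement(variants, nodes, cost_weight=1.0):
    """Exhaustively select the (variant, node) pair minimizing
    inference latency + communication latency + weighted utilization cost."""
    def total(v, n):
        return infer_ms[(v, n)] + comm_ms[n] + cost_weight * util_cost[n]
    return min(product(variants, nodes), key=lambda vn: total(*vn))

choice = best_placement(["small", "large"], ["edge1", "edge2"])
# → ("small", "edge1"): total 5 + 2 + 1.0 = 8.0, the cheapest option
```

Raising `cost_weight` shifts the choice toward cheaper nodes even at higher latency, mirroring the latency/utilization-cost tension the MVSP formulation captures.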

Updated: 2020-03-12