A deep neural network compression algorithm based on knowledge transfer for edge devices
Computer Communications (IF 4.5) Pub Date: 2020-09-28, DOI: 10.1016/j.comcom.2020.09.016
Yanming Chen, Chao Li, Luqi Gong, Xiang Wen, Yiwen Zhang, Weisong Shi

The limited computation and storage capacity of edge devices severely restricts the deployment of deep neural networks on them. Toward intelligent applications on edge devices, we introduce a deep neural network compression algorithm based on knowledge transfer, a three-stage pipeline: lightweight design, multi-level knowledge transfer, and pruning, which together reduce the network depth, parameter count, and operation complexity of deep neural networks. We lighten the network by using a global average pooling layer instead of a fully connected layer and by replacing standard convolutions with separable convolutions. Next, multi-level knowledge transfer minimizes the difference between the outputs of the "student network" and the "teacher network" at both the middle and logits layers, increasing the supervised information available when training the "student network". Lastly, we prune the network by cutting off unimportant convolution kernels with a global iterative pruning strategy. Experimental results show that the proposed method is up to 30% more effective than the knowledge distillation method at reducing the loss of classification performance. Benchmarked on a GPU (Graphics Processing Unit) server, a Raspberry Pi 3, and a Cambricon-1A, the compressed network achieves a parameter compression ratio of more than 49.5 times after our knowledge transfer and pruning, and the time efficiency of a single feedforward operation improves by more than 3.2 times.
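
To make the lightweight stage concrete, below is a minimal PyTorch sketch of the two substitutions the abstract describes: a depthwise separable convolution standing in for a standard convolution, and a global-average-pooling classifier standing in for a fully connected layer. PyTorch, the module names, and all layer sizes are our illustrative assumptions; the abstract does not specify them.

```python
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) conv
    followed by a 1x1 (pointwise) conv, replacing one standard KxK conv
    with far fewer parameters and operations."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride=stride,
                                   padding=padding, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class GapClassifier(nn.Module):
    """A 1x1 conv maps the feature map to one channel per class, then
    global average pooling yields the logits directly, so no fully
    connected layer is needed."""
    def __init__(self, in_ch, num_classes):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, num_classes, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling

    def forward(self, x):
        return self.pool(self.proj(x)).flatten(1)  # (N, num_classes)

# Example: a 32-channel feature map classified into 10 classes.
x = torch.randn(1, 32, 56, 56)
logits = GapClassifier(32, 10)(SeparableConv2d(32, 32)(x))  # shape (1, 10)
```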
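The multi-level transfer objective can likewise be sketched as a training loss. The abstract only states that student and teacher outputs are matched at a middle layer and at the logits layer, so the exact form below (a Hinton-style softened-logits KL term plus an L2 feature-matching term) and the hyperparameters T, alpha, and beta are assumptions for illustration.

```python
import torch.nn.functional as F

def multi_level_kt_loss(student_logits, teacher_logits,
                        student_feat, teacher_feat,
                        labels, T=4.0, alpha=0.7, beta=0.5):
    """Hypothetical multi-level knowledge transfer loss: hard-label cross
    entropy, plus logits-layer transfer via softened-distribution KL,
    plus middle-layer transfer via an L2 feature-matching penalty."""
    # Ordinary supervision from the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # Logits layer: match the teacher's temperature-softened distribution.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # Middle layer: match intermediate feature maps (assumed to have been
    # projected to the same shape beforehand).
    hint = F.mse_loss(student_feat, teacher_feat)
    return (1 - alpha) * ce + alpha * kd + beta * hint
```

Training the student against both terms is what "increases the supervised information" relative to logits-only distillation.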
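Finally, the global iterative pruning stage can be sketched as a scoring-and-zeroing step alternated with fine-tuning. The L1-norm importance criterion and the per-iteration pruning fraction below are our assumptions; the abstract says only that unimportant convolution kernels are cut off under a global, network-wide iterative strategy.

```python
import torch
import torch.nn as nn

def global_filter_prune(model, fraction=0.1):
    """One pruning iteration: score every convolution kernel (output
    filter) across the whole network, then zero out the globally least
    important `fraction` of them."""
    scores = []  # (importance, conv module, filter index)
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # L1 norm of each output filter's weights (assumed criterion).
            norms = m.weight.detach().abs().sum(dim=(1, 2, 3))
            scores.extend((n.item(), m, i) for i, n in enumerate(norms))
    scores.sort(key=lambda t: t[0])  # smallest importance first
    with torch.no_grad():
        for _, conv, idx in scores[:int(len(scores) * fraction)]:
            conv.weight[idx].zero_()  # cut off the kernel
            if conv.bias is not None:
                conv.bias[idx] = 0.0

# Typical loop: prune a small fraction, fine-tune to recover accuracy,
# and repeat until the target compression ratio is reached.
```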

Updated: 2020-10-02