An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level
ACM Transactions on Design Automation of Electronic Systems (IF 2.2), Pub Date: 2021-08-01, DOI: 10.1145/3460972
Mohammad-Ali Maleki, Alireza Nabipour-Meybodi, Mehdi Kamal, Ali Afzali-Kusha, Massoud Pedram

In this article, we present a low-energy inference method for convolutional neural networks in image classification applications. The lower energy consumption is achieved by using a highly pruned (lower-energy) network whenever that network can provide a correct output. More specifically, the proposed inference method makes use of two pruned neural networks (NNs), namely a mildly and an aggressively pruned network, both of which are designed offline. In the system, a third NN uses the input data to select the appropriate pruned network online. For its feature extraction, this third network employs the same convolutional layers as the aggressively pruned NN, thereby reducing the overhead of the online management. The proposed method induces some accuracy loss; however, for a given level of accuracy, its energy gain is considerably larger than that of employing any single pruning level. The proposed method is independent of both the pruning method and the network architecture. The efficacy of the proposed inference method is assessed on the Eyeriss hardware accelerator platform for several state-of-the-art NN architectures. Our studies show that this method may provide, on average, a 70% energy reduction compared to the original NN at the cost of about 3% accuracy loss on the CIFAR-10 dataset.
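The control flow described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: all function names, the placeholder "networks", and the confidence threshold are our own assumptions. The key point it shows is that the selector reuses the convolutional features of the aggressively pruned network, so choosing the low-energy path costs almost nothing extra.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the networks in the paper; real models would be
# trained CNNs. shared_conv_features represents the convolutional layers that
# the selector shares with the aggressively pruned network.
def shared_conv_features(image):
    # Placeholder for conv feature maps (computed once, reused twice below).
    return image.mean(axis=0)

def selector_head(features):
    # Tiny classifier estimating whether the aggressive network will be correct.
    return float(features.sum() % 1.0)  # placeholder confidence in [0, 1)

def aggressive_head(features):
    # Classifier head of the aggressively pruned network (10 CIFAR-10 classes).
    return int(features.argmax()) % 10

def mild_network(image):
    # The mildly pruned network runs from scratch on the raw input (fallback).
    return int(image.sum()) % 10

def dynamic_inference(image, threshold=0.5):
    """Route each input to the cheapest network the selector trusts."""
    feats = shared_conv_features(image)  # one feature extraction pass
    if selector_head(feats) >= threshold:
        # Low-energy path: reuse feats, run only the aggressive head.
        return aggressive_head(feats), "aggressive"
    # Fallback path: pay for the mildly pruned network's full forward pass.
    return mild_network(image), "mild"

image = rng.random((3, 8, 8))  # toy CHW input
label, path = dynamic_inference(image)
```

Per-input routing is what distinguishes this from static pruning: easy inputs take the aggressive (cheap) path, and only hard inputs pay for the mildly pruned network, which is how the average energy drops without matching the aggressive network's accuracy loss.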

Updated: 2021-08-01