Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks
IEEE Transactions on Computers (IF 3.6), Pub Date: 2020-01-01, DOI: 10.1109/tc.2020.2972520
Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

The high energy cost of processing deep convolutional neural networks impedes their ubiquitous deployment in energy-constrained platforms such as embedded systems and IoT devices. This article introduces convolutional layers with pre-defined sparse 2D kernels whose support sets repeat periodically within and across filters. Because these periodic sparse kernels can be stored efficiently, the parameter savings translate into considerable improvements in energy efficiency through reduced DRAM accesses, promising a significantly better trade-off between energy consumption and accuracy for both training and inference. To evaluate this approach, we performed experiments with two widely accepted datasets, CIFAR-10 and Tiny ImageNet, on sparse variants of the ResNet18 and VGG16 architectures. Compared to baseline models, our proposed sparse variants require up to ~82% fewer model parameters and 5.6× fewer FLOPs with negligible loss in accuracy for ResNet18 on CIFAR-10. For VGG16 trained on Tiny ImageNet, our approach requires 5.8× fewer FLOPs and up to ~83.3% fewer model parameters with a drop in top-5 (top-1) accuracy of only 1.2% (~2.1%). We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2. Using similar hyperparameters and FLOPs, our ResNet18 variants yield an average accuracy improvement of ~2.8%.
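
To make the idea of pre-defined periodic sparsity concrete, the following is a minimal PyTorch sketch, not the authors' implementation, of a convolutional layer whose 2D kernels use fixed support sets drawn from a small pool of patterns that repeats periodically across input and output channels. The mask construction, the density and period parameters, and the names make_periodic_mask and PreDefinedSparseConv2d are illustrative assumptions, not taken from the paper.

# Minimal sketch of a conv layer with a pre-defined, periodically repeating
# sparse support set. All names and the exact mask layout are assumptions
# made for illustration; they are not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_periodic_mask(out_ch, in_ch, k, density=0.25, period=4):
    """Binary mask of shape (out_ch, in_ch, k, k).

    Each 2D kernel keeps a fixed support set chosen from a small pool of
    `period` patterns; the pattern index is a function of the channel
    indices, so the support repeats both within and across filters and
    can be stored compactly.
    """
    n_keep = max(1, int(round(density * k * k)))
    g = torch.Generator().manual_seed(0)           # fixed before training
    patterns = []
    for _ in range(period):
        idx = torch.randperm(k * k, generator=g)[:n_keep]
        pat = torch.zeros(k * k)
        pat[idx] = 1.0
        patterns.append(pat.view(k, k))
    patterns = torch.stack(patterns)               # (period, k, k)

    mask = torch.empty(out_ch, in_ch, k, k)
    for o in range(out_ch):
        for i in range(in_ch):
            mask[o, i] = patterns[(o + i) % period]
    return mask


class PreDefinedSparseConv2d(nn.Conv2d):
    """Conv2d whose weights are multiplied by a fixed sparsity mask."""

    def __init__(self, in_ch, out_ch, k, density=0.25, period=4, **kw):
        super().__init__(in_ch, out_ch, k, **kw)
        self.register_buffer(
            "mask", make_periodic_mask(out_ch, in_ch, k, density, period))

    def forward(self, x):
        # Masked positions stay zero and receive zero gradient.
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


if __name__ == "__main__":
    conv = PreDefinedSparseConv2d(16, 32, 3, density=0.25, period=4, padding=1)
    y = conv(torch.randn(1, 16, 8, 8))
    print(y.shape, "kept fraction of weights:", conv.mask.mean().item())

In this sketch the mask is fixed before training, so masked weights never need to be stored, and the periodic pattern can be regenerated from the channel indices; this is the kind of compact storage of periodic sparse kernels that the abstract refers to.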
