Acceleration of Deep Convolutional Neural Networks using Adaptive Filter Pruning
IEEE Journal of Selected Topics in Signal Processing (IF 7.5) Pub Date: 2020-05-01, DOI: 10.1109/jstsp.2020.2992390
Pravendra Singh, Vinay Kumar Verma, Piyush Rai, Vinay P. Namboodiri

While convolutional neural networks (CNNs) have achieved remarkable performance on various supervised and unsupervised learning tasks, they typically consist of a massive number of parameters. This results in significant memory requirements as well as a computational burden. Consequently, there is a growing need for filter-level pruning approaches for compressing CNN-based models that not only reduce the total number of parameters but also reduce the overall computation. We present a new min-max framework for the filter-level pruning of CNNs. Our framework jointly prunes and fine-tunes CNN model parameters, with an adaptive pruning rate, while maintaining the model’s predictive performance. Our framework consists of two modules: (1) an adaptive filter pruning (AFP) module, which minimizes the number of filters in the model; and (2) a pruning rate controller (PRC) module, which maximizes the accuracy during pruning. In addition, we introduce orthogonality regularization in the training of CNNs to reduce redundancy across the filters of a particular layer. In the proposed approach, we prune the least important filters and, at the same time, reduce the redundancy level in the model by using orthogonality constraints during training. Moreover, unlike most previous approaches, our approach allows directly specifying the desired error tolerance instead of the pruning level. We perform extensive experiments on object classification (LeNet, VGG, MobileNet, and ResNet) and object detection (SSD and Faster R-CNN) over benchmark datasets such as MNIST, CIFAR, GTSDB, ImageNet, and MS-COCO. We also present several ablation studies to validate the proposed approach. Our compressed models can be deployed at run-time, without requiring any special libraries or hardware. Our approach reduces the number of parameters of VGG-16 by an impressive factor of 17.5X, and the number of FLOPs by 6.43X, with no loss of accuracy, significantly outperforming other state-of-the-art filter pruning methods.
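The two mechanisms the abstract names can be illustrated concretely. Below is a minimal, hypothetical PyTorch sketch (not the authors' released code): an orthogonality penalty of the form ||WWᵀ − I||²_F over each layer's flattened filters, and selection of the least important filters; the L1-norm importance score and the penalty weight 1e-4 are illustrative assumptions, and the adaptive rate logic of the PRC module is not modeled here.

import torch
import torch.nn as nn

def orthogonality_penalty(conv: nn.Conv2d) -> torch.Tensor:
    """||W W^T - I||_F^2 over one layer's filters, flattened to rows."""
    w = conv.weight.flatten(1)                # (num_filters, in_ch*k*k)
    gram = w @ w.t()                          # pairwise filter correlations
    eye = torch.eye(gram.size(0), device=w.device)
    return ((gram - eye) ** 2).sum()

def least_important_filters(conv: nn.Conv2d, num_prune: int) -> torch.Tensor:
    """Indices of the num_prune filters with the smallest L1 norm
    (an assumed importance measure for illustration)."""
    scores = conv.weight.abs().flatten(1).sum(dim=1)  # one score per filter
    return torch.argsort(scores)[:num_prune]

# Usage sketch: add the penalty to the task loss while fine-tuning,
# then zero out (or physically remove) the selected filters.
conv = nn.Conv2d(64, 128, kernel_size=3)
task_loss = torch.tensor(0.0)                 # placeholder for the CE loss
loss = task_loss + 1e-4 * orthogonality_penalty(conv)  # assumed weight
with torch.no_grad():
    conv.weight[least_important_filters(conv, num_prune=16)] = 0.0

Zeroing filters keeps the sketch short; actual filter-level pruning would rebuild the layer with fewer output channels (and shrink the next layer's input channels accordingly) so that parameter and FLOP counts genuinely drop.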

Updated: 2020-05-01