PAC-Bayesian framework based drop-path method for 2D discriminative convolutional network pruning, Multidimensional Systems and Signal Processing


Multidimensional Systems and Signal Processing ( IF 1.7 ) Pub Date : 2019-10-19 , DOI: 10.1007/s11045-019-00686-z
Qinghe Zheng , Xinyu Tian , Mingqiang Yang , Yulin Wu , Huake Su

Deep convolutional neural networks (CNNs) have demonstrated extraordinary power on various visual tasks such as object detection and classification. However, deploying state-of-the-art models in real-world applications, such as autonomous vehicles, remains challenging because of their high computational cost. In this paper, to accelerate network inference, we introduce a novel pruning method named Drop-path that reduces the number of parameters of 2D deep CNNs. Given a trained deep CNN, paths of different lengths are pruned by ranking the neurons in each layer according to their influence on the probably approximately correct (PAC) Bayesian bound of the model. We argue that keeping the PAC-Bayesian bound invariant is an important factor in preserving the generalization ability of a deep CNN while the model is optimized as far as possible. To the best of our knowledge, this is the first work to reduce model size based on a generalization error bound. After pruning, we observe that the convolutional kernels themselves become sparse, rather than some kernels being removed outright. Drop-path is generic and generalizes well to multi-layer and multi-branch models, since the parameter ranking criterion can be applied to any kind of layer and the importance scores can still be propagated. Finally, Drop-path is evaluated on two image classification benchmark datasets (ImageNet and CIFAR-10) with multiple deep CNN models, including AlexNet, VGG-16, GoogLeNet, and ResNet-34/50/56/110. Experimental results demonstrate that Drop-path achieves significant model compression and acceleration with negligible accuracy loss.
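The abstract does not give the scoring rule, so the following is only a minimal sketch of the general idea of ranking parameters by their influence on a PAC-Bayesian bound. It is not the authors' Drop-path algorithm. It assumes a single linear classifier with labels in {-1, +1}, a zero-mean Gaussian prior and a Gaussian posterior with the same variance (so the KL term is ||w||^2 / (2 sigma^2)), and a McAllester-style complexity term; the empirical error of the mean classifier stands in for the posterior's error. All function names (`pac_bayes_proxy`, `rank_by_bound_influence`, `prune`) and parameters (`sigma`, `delta`, `keep_ratio`) are hypothetical.

```python
import numpy as np


def pac_bayes_proxy(w, X, y, sigma=1.0, delta=0.05):
    """Proxy for a McAllester-style PAC-Bayesian bound (illustrative assumption).

    Empirical 0-1 error of the mean classifier plus a complexity term
    sqrt((KL + ln(2*sqrt(n)/delta)) / (2n)), with KL = ||w||^2 / (2*sigma^2).
    """
    n = X.shape[0]
    preds = np.sign(X @ w)
    emp_err = np.mean(preds != y)
    kl = np.dot(w, w) / (2.0 * sigma ** 2)
    complexity = np.sqrt((kl + np.log(2.0 * np.sqrt(n) / delta)) / (2.0 * n))
    return float(emp_err + complexity)


def rank_by_bound_influence(w, X, y, **kwargs):
    """Return weight indices sorted by how little zeroing each one raises the proxy."""
    base = pac_bayes_proxy(w, X, y, **kwargs)
    influence = np.empty(w.size)
    for i in range(w.size):
        w_zeroed = w.copy()
        w_zeroed[i] = 0.0
        influence[i] = pac_bayes_proxy(w_zeroed, X, y, **kwargs) - base
    return np.argsort(influence)  # least harmful weights first


def prune(w, X, y, keep_ratio=0.5, **kwargs):
    """Zero the weights whose removal changes the bound proxy the least."""
    order = rank_by_bound_influence(w, X, y, **kwargs)
    n_prune = int(round((1.0 - keep_ratio) * w.size))
    mask = np.ones(w.size, dtype=bool)
    mask[order[:n_prune]] = False
    return w * mask, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))
    w = rng.normal(size=8)
    y = np.sign(X @ w)
    pruned_w, mask = prune(w, X, y, keep_ratio=0.5)
    print("kept weights:", int(mask.sum()), "of", w.size)
    print("proxy before:", pac_bayes_proxy(w, X, y))
    print("proxy after: ", pac_bayes_proxy(pruned_w, X, y))
```

Note that this sketch zeroes individual weights, so the weight vector becomes sparse rather than shrinking in size, which loosely mirrors the abstract's observation about sparse kernels. A real CNN would need per-layer scoring and propagation of importance scores across layers, which the abstract mentions but does not specify.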


Updated: 2019-10-19
