A "Network Pruning Network" Approach to Deep Model Compression
arXiv - CS - Machine Learning. Pub Date: 2020-01-15. DOI: arxiv-2001.05545
Vinay Kumar Verma, Pravendra Singh, Vinay P. Namboodiri, Piyush Rai

We present a filter pruning approach to deep model compression that uses a multitask network. Our approach is based on learning a pruner network to prune a pre-trained target network. The pruner is essentially a multitask deep neural network with binary outputs that identify the filters in each layer of the original network that make no significant contribution to the model and can therefore be pruned. The pruner network has the same architecture as the original network, except that its multitask/multi-output last layer contains one binary-valued output per filter, indicating which filters must be pruned. The pruner's goal is to minimize the number of filters retained from the original network by assigning zero weights to the corresponding output feature maps. In contrast to most existing methods, our approach does not rely on iterative pruning: it can prune the (original) network in one go and, moreover, does not require the degree of pruning to be specified for each layer (it can learn it instead). The compressed model produced by our approach is generic and needs no special hardware or software support. Moreover, augmenting our approach with other methods such as knowledge distillation, quantization, and connection pruning can further increase the degree of compression. We demonstrate the efficacy of the proposed approach on classification and object detection tasks.
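The mechanism described above (a gating network that mirrors the target's architecture and emits one binary decision per filter) can be sketched in a few dozen lines. The following PyTorch snippet is a minimal illustration, not the authors' implementation: the toy two-layer CNN, the names TargetNet and PrunerNet, the sigmoid relaxation of the binary outputs, the 0.5 keep threshold, and the sparsity weight of 1e-2 are all assumptions made for the sketch.

```python
# Illustrative sketch (hypothetical, not the paper's code): a pruner network
# that mirrors a small pre-trained CNN and emits one (soft) binary gate per
# filter. Gates multiply the corresponding output feature maps; filters whose
# gates settle near zero can be removed in a single pass after training.
import torch
import torch.nn as nn

class TargetNet(nn.Module):
    """Toy pre-trained network whose conv filters we want to prune."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.head = nn.Linear(32, 10)

    def forward(self, x, gates1, gates2):
        # Gates broadcast over spatial dims: a zero gate zeroes out
        # that filter's entire output feature map.
        x = torch.relu(self.conv1(x)) * gates1.view(1, -1, 1, 1)
        x = torch.relu(self.conv2(x)) * gates2.view(1, -1, 1, 1)
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)

class PrunerNet(nn.Module):
    """Same backbone as the target, but with a multi-output last layer:
    one sigmoid (pushed toward {0, 1}) per filter of the target network."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.gate_head = nn.Linear(32, 16 + 32)  # one output per target filter

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = x.mean(dim=(2, 3))
        # Average over the batch to get one gate value per filter.
        g = torch.sigmoid(self.gate_head(x)).mean(dim=0)
        return g[:16], g[16:]

# Training sketch: the target stays frozen; the pruner learns to preserve
# accuracy while a sparsity penalty drives as many gates as possible to zero.
target, pruner = TargetNet(), PrunerNet()
for p in target.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(pruner.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 32, 32)           # stand-in batch
y = torch.randint(0, 10, (8,))
for _ in range(100):
    g1, g2 = pruner(x)
    loss = criterion(target(x, g1, g2), y) + 1e-2 * (g1.sum() + g2.sum())
    opt.zero_grad()
    loss.backward()
    opt.step()

# One-shot pruning: filters whose gate falls below 0.5 are dropped.
print(f"conv1 filters kept: {int((g1 > 0.5).sum())} / 16")
print(f"conv2 filters kept: {int((g2 > 0.5).sum())} / 32")
```

Note that in this sketch the per-layer pruning ratios are not fixed in advance: they emerge from training the gates jointly across all layers, which matches the one-shot, learned-degree-of-pruning property claimed in the abstract.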

Updated: 2020-01-17