Filter pruning by image channel reduction in pre-trained convolutional neural networks
Multimedia Tools and Applications ( IF 3.0 ) Pub Date : 2020-07-24 , DOI: 10.1007/s11042-020-09373-9
Gi Su Chung , Chee Sun Won

In some domain-specific image classification problems, such as facial emotion recognition and house-number classification, the color information in the images may not be crucial for recognition. This motivates us to convert RGB images to gray-scale images with a single Y channel before feeding them into a pre-trained convolutional neural network (CNN). Since existing CNN models are pre-trained on three-channel color images, one can expect that some trained filters are more sensitive to color than to brightness. Therefore, by adopting single-channel gray-scale images as inputs, we can prune some of the convolutional filters in the first layer of the pre-trained CNN. This first-layer pruning greatly facilitates filter compression in the subsequent convolutional layers. The pre-trained CNN with the compressed filters is then fine-tuned on the single-channel images of a domain-specific dataset. Experimental results on the facial emotion and Street View House Numbers (SVHN) datasets show that the proposed method achieves significant compression of the pre-trained CNN filters. For example, compared with the VGG-16 model fine-tuned on color images, we save 10.538 GFLOPs of computation while keeping the classification accuracy around 84% on the facial emotion RAF-DB dataset.
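The abstract does not spell out the pruning criterion, but a minimal sketch of the idea is below. It converts RGB to a single Y channel (BT.601 luminance weights) and scores each first-layer filter by how much of its per-pixel RGB weight vector lies orthogonal to the luminance direction; filters dominated by that chrominance component are pruned first. The scoring function, the `keep_ratio` parameter, and the function names are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

# BT.601 luminance coefficients used for the RGB -> Y conversion
Y_COEFFS = np.array([0.299, 0.587, 0.114])

def rgb_to_gray(img):
    """Convert an HxWx3 RGB image into a single-channel Y image."""
    return img @ Y_COEFFS

def color_sensitivity(filt):
    """Illustrative score for a (3, k, k) first-layer filter:
    fraction of the filter's energy orthogonal to the luminance
    direction. 0.0 = pure brightness filter, 1.0 = pure color filter.
    (Assumed criterion -- the paper's own measure may differ.)"""
    y_dir = Y_COEFFS / np.linalg.norm(Y_COEFFS)
    w = filt.reshape(3, -1)              # one RGB vector per spatial tap
    proj = np.outer(y_dir, y_dir @ w)    # luminance component
    resid = w - proj                     # chrominance component
    total = np.sum(w ** 2)
    return float(np.sum(resid ** 2) / total) if total > 0 else 0.0

def prune_color_filters(filters, keep_ratio=0.5):
    """Return indices of the keep_ratio fraction of filters that are
    least color-sensitive (i.e., those worth keeping for Y inputs)."""
    scores = np.array([color_sensitivity(f) for f in filters])
    n_keep = max(1, int(round(keep_ratio * len(filters))))
    return np.sort(np.argsort(scores)[:n_keep])

# Toy example: a brightness-like filter vs. a red-green opponent filter
lum_filter = np.ones((3, 3, 3)) * Y_COEFFS[:, None, None]
color_filter = np.zeros((3, 3, 3))
color_filter[0], color_filter[1] = 1.0, -1.0
kept = prune_color_filters(np.stack([lum_filter, color_filter]), keep_ratio=0.5)
```

In this toy case the luminance-aligned filter scores near 0 and the opponent-color filter near 1, so only the former survives pruning; in practice the same ranking would be applied to the 64 first-layer filters of a pre-trained VGG-16 before fine-tuning.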


Updated: 2020-07-24