Delving deep into spatial pooling for squeeze-and-excitation networks
Pattern Recognition (IF 8), Pub Date: 2021-07-15, DOI: 10.1016/j.patcog.2021.108159
Xin Jin 1, Yanping Xie 2, Xiu-Shen Wei 3, 4, Bo-Rui Zhao 1, Zhao-Min Chen 1, Xiaoyang Tan 2

Squeeze-and-Excitation (SE) blocks have demonstrated significant accuracy gains for state-of-the-art deep architectures by re-weighting channel-wise feature responses. The SE block is an architectural unit that integrates two operations: a squeeze operation that employs global average pooling to aggregate spatial convolutional features into a channel feature, and an excitation operation that learns instance-specific channel weights from the squeezed feature to re-weight each channel. In this paper, we revisit the squeeze operation in SE blocks and shed light on why and how to embed rich (both global and local) information into the excitation module at minimal extra cost. In particular, we introduce a simple but effective two-stage spatial pooling process: rich descriptor extraction and information fusion. The rich descriptor extraction step aims to obtain a set of diverse (i.e., global and especially local) deep descriptors that contain more informative cues than global average pooling. The information fusion step then absorbs the additional information carried by these descriptors, helping the excitation operation return more accurate re-weighting scores in a data-driven manner. We validate the effectiveness of our method through extensive experiments on ImageNet for image classification and on MS-COCO for object detection and instance segmentation. In these experiments, our method achieves consistent improvements over SENets on all tasks, in some cases by a large margin.
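To make the squeeze/excitation split concrete, the following is a minimal NumPy sketch, not the authors' implementation. The standard path (global average pooling followed by a bottleneck with ReLU and sigmoid) follows the description in the abstract. The two-stage path is an assumption made for illustration only: the abstract does not specify how descriptors are extracted or fused, so here `rich_descriptors` pools over a hypothetical regular grid and `fuse` takes a weighted sum with hypothetical `fusion_weights`. All function and parameter names are invented for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_gap(x):
    """Standard SE squeeze: global average pooling. x has shape (C, H, W)."""
    return x.mean(axis=(1, 2))

def excitation(z, w1, w2):
    """Bottleneck MLP mapping a (C,) descriptor to (C,) channel weights."""
    hidden = np.maximum(w1 @ z, 0.0)  # ReLU
    return sigmoid(w2 @ hidden)

def rich_descriptors(x, grid=2):
    """Stage 1 (assumed form): one global descriptor plus grid*grid local
    descriptors, each the average of one spatial cell. Returns (C, 1 + grid**2)."""
    c, h, w = x.shape
    assert h % grid == 0 and w % grid == 0, "sketch assumes divisible sizes"
    cells = x.reshape(c, grid, h // grid, grid, w // grid).mean(axis=(2, 4))
    global_desc = x.mean(axis=(1, 2))[:, None]
    return np.concatenate([global_desc, cells.reshape(c, -1)], axis=1)

def fuse(descriptors, fusion_weights):
    """Stage 2 (assumed form): weighted sum of the K descriptors per channel."""
    return descriptors @ fusion_weights  # (C, K) @ (K,) -> (C,)

def se_block(x, w1, w2, fusion_weights=None, grid=2):
    """Re-weight channels of x; uses plain GAP when fusion_weights is None."""
    if fusion_weights is None:
        z = squeeze_gap(x)
    else:
        z = fuse(rich_descriptors(x, grid), fusion_weights)
    scores = excitation(z, w1, w2)
    return x * scores[:, None, None]  # broadcast over H and W

# Random weights stand in for learned parameters.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
fusion = np.full(5, 0.2)  # 1 global + 4 local descriptors

y_plain = se_block(x, w1, w2)
y_rich = se_block(x, w1, w2, fusion_weights=fusion)
print(y_plain.shape, y_rich.shape)  # prints (8, 4, 4) (8, 4, 4)
```

In this sketch, setting the fusion weights to select only the global descriptor recovers the ordinary SE block, so the two-stage pooling can be read as a generalization of the squeeze step that leaves the excitation module unchanged.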




Updated: 2021-07-30