Image content-dependent steerable kernels
The Visual Computer (IF 3.5) Pub Date: 2021-04-26, DOI: 10.1007/s00371-021-02128-z
Xiang Ye, Heng Wang, Yong Li

The attention mechanism plays an essential role in many tasks such as image classification, object detection, and instance segmentation. However, existing methods typically assign attention weights to the feature maps of the previous layer, while the kernels in the current layer remain static during inference. To explicitly model the dependency of individual kernel weights on image content at inference time, this work proposes the attention weight block (AWB), which makes kernels steerable by the content of a test image. Specifically, AWB computes a set of on-the-fly coefficients from the feature maps of the previous layer and applies them to the kernels in the current layer, making those kernels steerable. AWB kernels emphasize or suppress the weights of certain kernels depending on the content of the input sample and hence significantly improve the feature representation ability of deep neural networks. The proposed AWB is evaluated on various datasets, and experimental results show that the steerable kernels in AWB outperform state-of-the-art attention approaches when embedded in architectures for classification, object detection, and semantic segmentation. AWB outperforms ECA by 1.1% and 1.0% on the CIFAR-100 and Tiny ImageNet datasets, respectively, for image classification; outperforms CornerNet-Lite by 1.5% on the COCO2017 dataset for object detection; and outperforms FCN8s by 1.2% on the SBUshadow dataset for semantic segmentation.
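The abstract gives no implementation details, but the mechanism it describes, per-kernel coefficients computed on the fly from the previous layer's feature maps and multiplied into the current layer's kernels, can be sketched as follows. This is a minimal PyTorch illustration, not the paper's code: the pooling-and-MLP coefficient generator, the sigmoid gating, and the AttentionWeightBlock name are all assumptions made for this sketch.

    # Minimal sketch of a content-dependent steerable-kernel block, assuming a
    # global-pool + bottleneck-MLP coefficient generator (hypothetical design;
    # the paper's actual AWB architecture may differ).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionWeightBlock(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size, reduction=4):
            super().__init__()
            # Static kernel bank that will be steered per input sample.
            self.weight = nn.Parameter(
                torch.randn(out_channels, in_channels, kernel_size, kernel_size) * 0.01
            )
            hidden = max(in_channels // reduction, 1)
            # Maps globally pooled input features to one coefficient per kernel.
            self.fc = nn.Sequential(
                nn.Linear(in_channels, hidden),
                nn.ReLU(inplace=True),
                nn.Linear(hidden, out_channels),
                nn.Sigmoid(),
            )
            self.padding = kernel_size // 2

        def forward(self, x):
            b, c, _, _ = x.shape
            # On-the-fly coefficients computed from the incoming feature maps.
            coeff = self.fc(F.adaptive_avg_pool2d(x, 1).reshape(b, c))  # (b, out)
            # Scale each output kernel per sample, then run one grouped conv so
            # every sample in the batch sees its own steered kernel set.
            w = self.weight.unsqueeze(0) * coeff.view(b, -1, 1, 1, 1)
            w = w.reshape(-1, *self.weight.shape[1:])
            out = F.conv2d(x.reshape(1, -1, *x.shape[2:]), w,
                           padding=self.padding, groups=b)
            return out.reshape(b, -1, *out.shape[2:])

    # Example: 64 input channels, 128 steered 3x3 kernels.
    # awb = AttentionWeightBlock(64, 128, 3)
    # y = awb(torch.randn(8, 64, 32, 32))  # -> (8, 128, 32, 32)

Scaling the kernels themselves per sample, rather than re-weighting the output feature maps as in ECA-style channel attention, is what makes the convolution content-dependent at inference time, matching the contrast the abstract draws with prior attention methods.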




Updated: 2021-04-27