Top-Down Neural Attention by Excitation Backprop, International Journal of Computer Vision

Top-Down Neural Attention by Excitation Backprop
International Journal of Computer Vision (IF 19.5), Pub Date: 2017-12-23, DOI: 10.1007/s11263-017-1059-x
Jianming Zhang, Sarah Adel Bargal, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff

We aim to model the top-down attention of a convolutional neural network (CNN) classifier for generating task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, to pass top-down signals down the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. We show a theoretical connection between the proposed contrastive attention formulation and the Class Activation Map computation. An efficient implementation of Excitation Backprop for common neural network layers is also presented. In experiments, we visualize the evidence of a model’s classification decision by computing the proposed top-down attention maps. For quantitative evaluation, we report the accuracy of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images. Finally, we demonstrate applications of our method in model interpretation and data annotation assistance for facial expression analysis and medical imaging tasks.
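The abstract names the probabilistic Winner-Take-All process but does not spell out the propagation rule. As a rough illustration only, the sketch below assumes the commonly described form of that rule for a single fully connected layer: each output neuron distributes its winning probability to its inputs in proportion to the input activation times the non-negative part of the connecting weight. The function name `excitation_backprop_fc`, the argument layout, and the toy numbers are hypothetical and are not taken from the authors' implementation; contrastive attention is not covered.

```python
def excitation_backprop_fc(activations, weights, top_probs):
    """Pass winning probabilities down through one fully connected layer.

    Assumed conventions (illustrative, not from the paper's code):
      activations: non-negative layer inputs, length n_in
      weights:     weights[i][j] connects input j to output i (n_out x n_in)
      top_probs:   winning probabilities of the layer outputs, length n_out
    """
    n_in = len(activations)
    bottom_probs = [0.0] * n_in
    for i, p_top in enumerate(top_probs):
        # Only excitatory (non-negative) connections take part in the competition.
        excit = [activations[j] * max(weights[i][j], 0.0) for j in range(n_in)]
        total = sum(excit)
        if total <= 0.0:
            # No excitatory input: this output passes nothing down.
            continue
        for j in range(n_in):
            # Each input receives a share proportional to its excitation.
            bottom_probs[j] += p_top * excit[j] / total
    return bottom_probs


# Toy layer: 3 inputs, 2 outputs; all top-down probability sits on output 0.
activations = [1.0, 2.0, 0.0]
weights = [[0.5, 0.25, 1.0],
           [-1.0, 1.0, 0.3]]
attention = excitation_backprop_fc(activations, weights, [1.0, 0.0])
```

When every output neuron with non-zero probability has at least one excitatory input, the returned values sum to the same total as `top_probs`, which is what lets the result be read as a probability map over the layer below.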

Updated: 2017-12-23
