Adapting Grad-CAM for Embedding Networks
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-01-17, DOI: arxiv-2001.06538
Lei Chen, Jianhui Chen, Hossein Hajimirsadeghi and Greg Mori

The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight important regions in images for deep model prediction in image classification, image captioning and many other tasks. It uses the gradients in back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embedding networks are trained by millions of dynamically paired examples (e.g. triplets). To overcome these challenges, we propose an adaptation of the Grad-CAM method for embedding networks. First, we aggregate grad-weights from multiple training examples to improve the stability of Grad-CAM. Then, we develop an efficient weight-transfer method to explain decisions for any image without back-propagation. We extensively validate the method on the standard CUB200 dataset in which our method produces more accurate visual attention than the original Grad-CAM method. We also apply the method to a house price estimation application using images. The method produces convincing qualitative results, showcasing the practicality of our approach.
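The abstract describes three steps: using gradients as channel weights (grad-weights), aggregating those weights over several training examples, and reusing the stored weights on a new image without back-propagation. The following is a minimal sketch of that idea on toy data, not the authors' implementation. The function names, the plain averaging used for aggregation, and the nested-list layout `[channel][row][col]` are all assumptions made for illustration; the abstract does not specify how weights are aggregated or how they are transferred to a new image.

```python
from typing import List

FeatureMaps = List[List[List[float]]]  # layout assumed here: [channel][row][col]


def grad_weights(grads: FeatureMaps) -> List[float]:
    """Grad-CAM channel weights: spatial mean of the gradient in each channel."""
    weights = []
    for channel in grads:
        values = [v for row in channel for v in row]
        weights.append(sum(values) / len(values))
    return weights


def aggregate_weights(per_example: List[List[float]]) -> List[float]:
    """Average grad-weights over several examples (assumed aggregation rule)."""
    n = len(per_example)
    return [sum(ws) / n for ws in zip(*per_example)]


def cam(activations: FeatureMaps, weights: List[float]) -> List[List[float]]:
    """Heat map: ReLU of the weighted sum of activation maps (forward pass only)."""
    rows, cols = len(activations[0]), len(activations[0][0])
    heat = [[0.0] * cols for _ in range(rows)]
    for w, channel in zip(weights, activations):
        for i in range(rows):
            for j in range(cols):
                heat[i][j] += w * channel[i][j]
    return [[max(0.0, v) for v in row] for row in heat]


# Toy gradients from two hypothetical training examples (2 channels, 2x2 maps).
grads_a = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
grads_b = [[[1.5, 1.5], [1.5, 1.5]], [[0.0, 0.0], [0.0, 0.0]]]
shared = aggregate_weights([grad_weights(grads_a), grad_weights(grads_b)])

# Activations of a new image: the stored weights are applied without gradients.
new_activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
heat_map = cam(new_activations, shared)
print(shared)    # [1.0, -0.5]
print(heat_map)  # [[1.0, 0.0], [0.0, 2.0]]
```

In this toy case the second channel receives a negative aggregated weight, so the locations where only that channel fires are clipped to zero by the ReLU. In a real embedding network the gradients would come from a loss over paired examples such as triplets, which is where the instability mentioned in the abstract arises.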

Updated: 2020-01-22