Multi-scale relation reasoning for multi-modal Visual Question Answering
Signal Processing: Image Communication (IF 3.5), Pub Date: 2021-05-14, DOI: 10.1016/j.image.2021.116319
Yirui Wu, Yuntao Ma, Shaohua Wan

The goal of Visual Question Answering (VQA) is to answer questions about images. The same picture can prompt completely different types of questions, so the main difficulty of the VQA task lies in reasoning properly about the relationships among multiple visual objects according to the type of the input question. To address this difficulty, this paper proposes a deep neural network that performs multi-modal relation reasoning at multiple scales and builds a regional attention scheme that focuses on informative, question-related regions for better answering. Specifically, we first design a regional attention scheme that selects regions of interest based on an informativeness evaluation computed by a question-guided soft attention module. The features computed by the regional attention scheme are then fused in scaled combinations, generating more distinctive features that carry information at several scales. Owing to the regional attention design and the multi-scale property, the proposed method is capable of describing scaled relationships from multi-modal inputs and of offering accurate question-guided answers. Experiments on the VQA v1 and VQA v2 datasets show that the proposed method achieves superior efficiency compared with most existing methods.
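The abstract gives no architectural details, so the following is only a toy sketch of the two ideas it names: question-guided soft attention over region features, and fusion of the selected regions in combinations of several sizes. Every function name, the dot-product scoring, the top-k selection, and the sum-then-average fusion are assumptions made for illustration; none of them is taken from the paper.

```python
import math
from itertools import combinations


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def question_guided_attention(regions, question):
    """Assumed scoring: scaled dot product between each region and the question."""
    scale = math.sqrt(len(question))
    scores = [sum(r * q for r, q in zip(region, question)) / scale
              for region in regions]
    return softmax(scores)


def select_regions(regions, weights, k):
    """Keep the k highest-weighted regions, each scaled by its attention weight."""
    order = sorted(range(len(regions)), key=lambda i: weights[i], reverse=True)
    return [[weights[i] * x for x in regions[i]] for i in order[:k]]


def multi_scale_fuse(selected, scales=(1, 2)):
    """For each scale s, sum every combination of s regions, average the sums
    over all combinations, and concatenate the per-scale results."""
    dim = len(selected[0])
    fused = []
    for s in scales:
        if s > len(selected):
            raise ValueError("scale exceeds the number of selected regions")
        groups = list(combinations(selected, s))
        pooled = [0.0] * dim
        for group in groups:
            for vec in group:
                for j, x in enumerate(vec):
                    pooled[j] += x
        fused.extend(p / len(groups) for p in pooled)
    return fused


# Hypothetical 2-D region features and question embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
question = [1.0, 0.0]
weights = question_guided_attention(regions, question)
selected = select_regions(regions, weights, k=2)
fused = multi_scale_fuse(selected, scales=(1, 2))
```

In this sketch the fused vector has one block of features per scale, so its length is the feature dimension times the number of scales. A real model would learn the scoring and fusion functions rather than use fixed dot products and averages.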




Updated: 2021-05-18