当前位置: X-MOL 学术Artif. Intell. Rev. › 论文详情
Visual question answering: a state-of-the-art review
Artificial Intelligence Review ( IF 5.095 ) Pub Date : 2020-04-08 , DOI: 10.1007/s10462-020-09832-7
Sruthy Manmadhan, Binsu C. Kovoor

Visual question answering (VQA) is a task that has received immense consideration from two major research communities: computer vision and natural language processing. Recently it has been widely accepted as an AI-complete task which can be used as an alternative to visual turing test. In its most common form, it is a multi-modal challenging task where a computer is required to provide the correct answer for a natural language question asked about an input image. It attracts many deep learning researchers after their remarkable achievements in text, voice and vision technologies. This review extensively and critically examines the current status of VQA research in terms of step by step solution methodologies, datasets and evaluation metrics. Finally, this paper also discusses future research directions for all the above-mentioned aspects of VQA separately.
更新日期:2020-04-20

 

全部期刊列表>>
智控未来
聚焦商业经济政治法律
跟Nature、Science文章学绘图
控制与机器人
招募海内外科研人才,上自然官网
隐藏1h前已浏览文章
课题组网站
新版X-MOL期刊搜索和高级搜索功能介绍
ACS材料视界
x-mol收录
湖南大学化学化工学院刘松
上海有机所
廖良生
南方科技大学
西湖大学
伊利诺伊大学香槟分校
徐明华
中山大学化学工程与技术学院
试剂库存
天合科研
down
wechat
bug