An Overview of Image Caption Generation Methods, Computational Intelligence and Neuroscience



Computational Intelligence and Neuroscience (IF 3.120), Pub Date: 2020-01-09, DOI: 10.1155/2020/3062706
Haoran Wang₁, Yue Zhang₁, Xiaosheng Yu₂


In recent years, with the rapid development of artificial intelligence, image captioning has attracted the attention of many researchers in the field and has become an interesting and challenging task. Image captioning, the automatic generation of natural language descriptions from the content observed in an image, is an important part of scene understanding and combines knowledge from computer vision and natural language processing. Its applications are extensive and significant, for example in human-computer interaction. This paper summarizes the related methods and focuses on the attention mechanism, which plays an important role in computer vision and has recently been widely used in image caption generation. Furthermore, the advantages and shortcomings of these methods are discussed, and the commonly used datasets and evaluation criteria in this field are presented. Finally, this paper highlights some open challenges in the image captioning task.

Updated: 2020-01-09
