Describing Video With Attention-Based Bidirectional LSTM
IEEE Transactions on Cybernetics ( IF 9.4 ) Pub Date : 5-25-2018 , DOI: 10.1109/tcyb.2018.2831447
Yi Bin , Yang Yang , Fumin Shen , Ning Xie , Heng Tao Shen , Xuelong Li

Video captioning has been attracting broad research attention in the multimedia community. However, most existing approaches either rely heavily on static visual information or capture only local temporal knowledge (e.g., within 16 frames), and thus can hardly describe motions accurately from a global view. In this paper, we propose a novel video captioning framework that integrates bidirectional long short-term memory (BiLSTM) and a soft attention mechanism to generate better global representations for videos and to enhance the recognition of lasting motions in videos. To generate video captions, we exploit another long short-term memory (LSTM) as a decoder to fully explore global contextual information. The benefits of our proposed method are twofold: 1) the BiLSTM structure comprehensively preserves global temporal and visual information and 2) the soft attention mechanism enables a language decoder to recognize and focus on principal targets from the complex content. We verify the effectiveness of our proposed video captioning framework on two widely used benchmarks, namely the Microsoft Video Description Corpus (MSVD) and MSR-Video to Text (MSR-VTT), and the experimental results demonstrate the superiority of the proposed approach compared to several state-of-the-art methods.
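
The soft attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, and the dot-product scoring are all assumptions chosen for brevity, and the paper may score frames differently. The sketch only shows the general idea of weighting per-frame encoder states (for example, BiLSTM outputs) by their relevance to the current decoder state and summing them into one context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(frame_states, decoder_state):
    """Return (weights, context) for one decoding step.

    frame_states: list of per-frame encoder vectors (hypothetical stand-ins
        for BiLSTM outputs), all of the same length.
    decoder_state: current decoder hidden vector of that same length.
    Dot-product scoring is an assumption made for this sketch.
    """
    # Relevance score of each frame with respect to the decoder state.
    scores = [
        sum(h * s for h, s in zip(frame, decoder_state))
        for frame in frame_states
    ]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the frame states.
    dim = len(frame_states[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_states))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three "frames" with two-dimensional states.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = soft_attention(frames, [2.0, 0.0])
print(weights)
print(context)
```

In this toy input the first and third frames align with the decoder state, so they receive equal weights that are larger than the weight of the second frame. A decoder would recompute these weights at every word it generates.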

Updated: 2024-08-22