Aligning Subtitles in Sign Language Videos
arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2021-05-06 , DOI: arxiv-2105.02877
Hannah Bull, Triantafyllos Afouras, Gül Varol, Samuel Albanie, Liliane Momeni, Andrew Zisserman

The goal of this work is to temporally align asynchronous subtitles in sign language videos. In particular, we focus on sign-language interpreted TV broadcast data comprising (i) a video of continuous signing, and (ii) subtitles corresponding to the audio content. Previous work exploiting such weakly-aligned data only considered finding keyword-sign correspondences, whereas we aim to localise a complete subtitle text in continuous signing. We propose a Transformer architecture tailored for this task, which we train on manually annotated alignments covering over 15K subtitles that span 17.7 hours of video. We use BERT subtitle embeddings and CNN video representations learned for sign recognition to encode the two signals, which interact through a series of attention layers. Our model outputs frame-level predictions, i.e., for each video frame, whether it belongs to the queried subtitle or not. Through extensive evaluations, we show substantial improvements over existing alignment baselines that do not make use of subtitle text embeddings for learning. Our automatic alignment model opens up possibilities for advancing machine translation of sign languages by providing continuously synchronized video-text data.
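To make the frame-level formulation concrete, the following is a minimal NumPy sketch of the core idea: video-frame features attend to subtitle-token features via scaled dot-product cross-attention, and a classifier then scores each frame as belonging (or not) to the queried subtitle. All names, dimensions, and the single-head attention with a linear classifier head are illustrative assumptions; the paper's actual model is a full Transformer trained on BERT and CNN features.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(frame_feats, token_feats):
    # frames query the subtitle tokens: Q from video, K/V from text
    d = frame_feats.shape[-1]
    scores = frame_feats @ token_feats.T / np.sqrt(d)   # (T, L)
    return softmax(scores, axis=-1) @ token_feats       # (T, D)

def frame_predictions(frame_feats, token_feats, w):
    # concatenate each frame with its text-attended summary,
    # then score frame membership in the queried subtitle
    attended = cross_attend(frame_feats, token_feats)
    logits = np.concatenate([frame_feats, attended], axis=-1) @ w
    return 1.0 / (1.0 + np.exp(-logits))                # per-frame probability

rng = np.random.default_rng(0)
T, L, D = 120, 12, 64                 # video frames, subtitle tokens, feature dim
frames = rng.normal(size=(T, D))      # stand-in for CNN sign-video features
tokens = rng.normal(size=(L, D))      # stand-in for BERT subtitle embeddings
w = rng.normal(size=(2 * D,)) * 0.1   # toy classifier weights
probs = frame_predictions(frames, tokens, w)
aligned = probs > 0.5                 # frames predicted to belong to the subtitle
print(probs.shape)
```

In the trained model, thresholding the per-frame probabilities yields the temporal span of the subtitle within the continuous signing video.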

Updated: 2021-05-07