Generative Imagination Elevates Machine Translation
arXiv - CS - Computation and Language. Pub Date: 2020-09-21, DOI: arxiv-2009.09654
Quanyu Long, Mingxuan Wang, Lei Li

There are thousands of languages on Earth, but visual perception is shared among peoples. Existing multimodal neural machine translation (MNMT) methods achieve knowledge transfer by enforcing one encoder to learn a shared representation across the textual and visual modalities. However, both training and inference rely heavily on well-aligned bilingual sentence-image triplets as input, which are often limited in quantity. In this paper, we hypothesize that visual imagination, i.e., synthesizing a visual representation from the source text, can help the neural model map between two languages with different symbols, thus helping the translation task. Our proposed end-to-end imagination-based machine translation model (ImagiT) first learns to generate a semantically consistent visual representation from the source sentence, and then generates the target sentence based on both the text representation and the imagined visual representation. Experiments demonstrate that our translation model benefits from visual imagination and significantly outperforms the text-only neural machine translation (NMT) baseline. We also conduct analysis experiments, and the results show that imagination can help fill in missing information under the degradation strategy.
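The two-stage flow the abstract describes, encoding the source text, "imagining" a visual representation from it, and then decoding the target conditioned on both, can be sketched minimally as below. This is an illustrative toy with randomly initialized weights and made-up dimensions, not the paper's actual architecture: the names `encode`, `imagine`, and `decode_step`, and all sizes, are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- illustrative only, not the paper's actual sizes.
d_text, d_img, src_len, tgt_vocab = 16, 8, 5, 32

# Toy parameters for a two-stage model in the spirit of ImagiT:
# stage 1 "imagines" visual features from the text encoding,
# stage 2 decodes the target from text + imagined visual features.
W_imagine = rng.normal(size=(d_text, d_img)) * 0.1          # text -> visual space
W_out = rng.normal(size=(d_text + d_img, tgt_vocab)) * 0.1  # fused -> vocab logits

def encode(src_embeddings):
    """Stand-in text encoder: mean-pool the source token embeddings."""
    return src_embeddings.mean(axis=0)                # (d_text,)

def imagine(text_repr):
    """Synthesize a visual representation from the text encoding alone."""
    return np.tanh(text_repr @ W_imagine)             # (d_img,)

def decode_step(text_repr, visual_repr):
    """One greedy decoding step conditioned on both modalities."""
    fused = np.concatenate([text_repr, visual_repr])  # (d_text + d_img,)
    logits = fused @ W_out
    return int(np.argmax(logits))                     # predicted target token id

src = rng.normal(size=(src_len, d_text))  # embedded source sentence
h = encode(src)
v = imagine(h)            # no paired image is needed at inference time
token = decode_step(h, v)
```

The key property this mirrors is that `imagine` takes only the text encoding as input, so the model needs no image at inference time, unlike MNMT methods that require sentence-image pairs.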

Updated: 2020-09-22