Full-Sentence Models Perform Better in Simultaneous Translation Using the Information Enhanced Decoding Strategy
arXiv - CS - Artificial Intelligence. Pub Date: 2021-05-05, DOI: arXiv-2105.01893
Zhengxin Yang

Simultaneous translation, which starts translating each sentence after receiving only a few words of the source sentence, plays a vital role in many scenarios. Although the previous prefix-to-prefix framework is considered suitable for simultaneous translation and achieves good performance, it still has two inevitable drawbacks: the high computational cost of training a separate model for each latency $k$, and an insufficient ability to encode information, because each target token can only attend to a specific source prefix. We propose a novel framework that adopts a simple but effective decoding strategy designed for full-sentence models. Within this framework, a single trained full-sentence model can achieve any given latency, saving computational resources. Moreover, since the full-sentence model is able to encode the whole sentence, our decoding strategy can enhance the information maintained in the decoder states in real time. Experimental results show that our method achieves better translation quality than the baselines in four directions: Zh$\rightarrow$En, En$\rightarrow$Ro, and En$\leftrightarrow$De.
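To make the decoding schedule concrete, here is a minimal Python sketch of the general idea under a wait-$k$ style schedule: a single full-sentence model is re-run on the growing source prefix at inference time, so each emitted target token is conditioned on all source words available at emission time. The `translate_prefix` callable and the toy copy model are hypothetical placeholders, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation) of driving one
# full-sentence model with a wait-k style schedule at test time.
# `translate_prefix` is a hypothetical stand-in for running the model
# on the source prefix read so far.

from typing import Callable, List

def simultaneous_decode(
    source_stream: List[str],
    translate_prefix: Callable[[List[str]], List[str]],
    k: int,
) -> List[str]:
    """Wait for k source tokens, then emit one target token per token read."""
    committed: List[str] = []
    for t in range(1, len(source_stream) + 1):
        if t < k:
            continue  # wait phase: keep reading, emit nothing yet
        # Re-decode on the fresh source prefix. A real system would
        # force-decode the committed tokens so decoder states stay
        # consistent; here we assume the new hypothesis extends them.
        hyp = translate_prefix(source_stream[:t])
        while len(committed) < min(t - k + 1, len(hyp)):
            committed.append(hyp[len(committed)])
    # Source exhausted: let the full-sentence model finish the tail.
    tail = translate_prefix(source_stream)[len(committed):]
    return committed + tail

if __name__ == "__main__":
    # Toy "model" that copies the source word by word, standing in
    # for a trained full-sentence translation model.
    def toy_model(prefix: List[str]) -> List[str]:
        return list(prefix)

    print(simultaneous_decode("this is a small example".split(), toy_model, k=3))
```

Note that in this sketch the same trained model serves any latency simply by changing `k`, which is the computational saving the abstract describes; refreshing the hypothesis on every new source word is one simple way to keep the decoder states informed in real time.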

Updated: 2021-05-06