Progressive Multi-Granularity Training for Non-Autoregressive Translation
arXiv - CS - Computation and Language. Pub Date: 2021-06-10, DOI: arxiv-2106.05546
Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu

Non-autoregressive translation (NAT) significantly accelerates inference by predicting the entire target sequence in parallel. However, recent studies show that NAT is weak at learning high-mode knowledge, such as one-to-many translations. We argue that modes can be divided into various granularities, which can be learned from easy to hard. In this study, we empirically show that NAT models are prone to learn fine-grained, lower-mode knowledge, such as words and phrases, compared with sentences. Based on this observation, we propose progressive multi-granularity training for NAT. More specifically, to make the most of the training data, we break down the sentence-level examples into three types, i.e., words, phrases, and sentences, and progressively increase the granularity as training proceeds. Experiments on Romanian-English, English-German, Chinese-English, and Japanese-English demonstrate that our approach improves phrase translation accuracy and model reordering ability, resulting in better translation quality against strong NAT baselines. We also show that more deterministic fine-grained knowledge can further enhance performance.
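The training schedule described above can be read as a staged data curriculum. Below is a minimal Python sketch, assuming word-, phrase-, and sentence-level pairs have already been extracted from the sentence-level corpus; the function name progressive_curriculum, the equal-length stages, and uniform sampling are illustrative assumptions rather than the paper's exact recipe.

```python
# A minimal sketch of a progressive multi-granularity curriculum, assuming
# word-, phrase-, and sentence-level pairs were pre-extracted from the
# parallel corpus. Stage boundaries and uniform sampling are assumptions,
# not the paper's exact training recipe.
import random
from typing import Iterator, List, Tuple

Pair = Tuple[str, str]  # (source text, target text)

def progressive_curriculum(words: List[Pair],
                           phrases: List[Pair],
                           sentences: List[Pair],
                           total_steps: int) -> Iterator[Pair]:
    """Yield one training pair per step, adding coarser granularities
    as training proceeds: words -> +phrases -> +sentences."""
    pools = [words,                        # stage 0: fine-grained only
             words + phrases,              # stage 1: add phrases
             words + phrases + sentences]  # stage 2: full sentences too
    boundaries = (total_steps // 3, 2 * total_steps // 3)
    for step in range(total_steps):
        stage = sum(step >= b for b in boundaries)  # 0, 1, or 2
        yield random.choice(pools[stage])
```

Note that all three granularities come from the same sentence-level examples (words and phrases are broken out of the sentence pairs), so the schedule reuses the training data rather than requiring additional corpora.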

Updated: 2021-06-11