Neural Machine Translation with Deep Attention
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2018-10-16, DOI: 10.1109/tpami.2018.2876404
Biao Zhang, Deyi Xiong, Jinsong Su

Deepening neural models has proven very successful in improving model capacity on complex learning tasks such as machine translation. Previous work on deep neural machine translation has focused mainly on the encoder and the decoder, with little attention paid to the attention mechanism. However, the attention mechanism is vital for inducing the translation correspondence between languages, a task for which shallow neural networks are relatively insufficient, especially when the encoder and decoder are deep. In this paper, we propose a deep attention model (DeepAtt). Based on low-level attention information, DeepAtt automatically determines what should be passed or suppressed from the corresponding encoder layer, so that the distributed representation is appropriate for high-level attention and translation. We conduct experiments on NIST Chinese-English, WMT English-German, and WMT English-French translation tasks, where, with five attention layers, DeepAtt yields performance very competitive with state-of-the-art results. We empirically find that with an adequate increase in attention layers, DeepAtt tends to produce more accurate attention weights. An in-depth analysis of the translation of important context words further reveals that DeepAtt significantly improves the faithfulness of system translations.
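The abstract gives no equations, so the following is only an illustrative sketch of the general idea of stacking attention layers, where the context from a lower attention layer gates what the next encoder layer passes on. It is not the paper's formulation: the function names, the dot-product scoring, and the elementwise sigmoid gate are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, states):
    """Plain dot-product attention: returns (weights, context vector)."""
    weights = softmax([dot(query, h) for h in states])
    dim = len(states[0])
    context = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return weights, context


def stacked_gated_attention(query, encoder_layers):
    """Hypothetical stacked attention, one attention step per encoder layer.

    encoder_layers[k] holds the source-position state vectors of encoder
    layer k. From the second layer on, the lower-level context gates each
    state elementwise (assumed gate: sigmoid(context * state)) and is
    added to the query before attention is applied.
    """
    context = None
    all_weights = []
    for states in encoder_layers:
        q = query
        if context is not None:
            # Gate values lie in (0, 1): they pass or suppress each dimension.
            states = [[sigmoid(c * x) * x for c, x in zip(context, h)]
                      for h in states]
            q = [a + b for a, b in zip(query, context)]
        weights, context = attend(q, states)
        all_weights.append(weights)
    return all_weights, context


# Two encoder layers, two source positions, state dimension 2 (toy values).
layers = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, -1.0]]]
weights_per_layer, final_context = stacked_gated_attention([1.0, 0.0], layers)
```

Each entry of `weights_per_layer` is a distribution over source positions for one attention layer, which is the quantity the abstract refers to when it says deeper attention tends to produce more accurate attention weights.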

Updated: 2019-12-06