


Improving Neural Machine Translation Model with Deep Encoding Information
Cognitive Computation (IF 4.3), Pub Date: 2021-05-15, DOI: 10.1007/s12559-021-09860-7
Guiduo Duan, Haobo Yang, Ke Qin, Tianxi Huang

The availability of very high computational power, together with advances in deep neural network (DNN) technology, has enabled rapid progress in machine translation. The strong representational ability of deep neural networks also lets neural machine translation (NMT) exploit large-scale bilingual parallel corpora and the available computing power to build highly effective translation models. Nevertheless, existing NMT models use only the information from the top encoder layer, whereas the information available in deeper encoding layers is often ignored. This significantly constrains the performance of the translation model. To address this issue, we propose a novel NMT model that can fully exploit the deep encoding information. The core idea is to aggregate the information from different encoding layers in different ways. We design three aggregation strategies: parallel-layer, multi-layer, and dynamic-layer encoding information aggregation. Three translation models are trained accordingly and compared with the baseline Transformer model on a Chinese-to-English translation task. The experimental results indicate that the BLEU-4 score of the proposed model is 0.89 higher than that of the baseline model, which demonstrates the effectiveness of the proposed method.
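
The abstract does not say how the three aggregation strategies are computed, so the following is only a minimal sketch of the general idea of combining the outputs of several encoder layers instead of using the top layer alone. It assumes a softmax-weighted sum over layers; the names `aggregate_layers` and `layer_scores` are hypothetical, and this should not be read as the authors' parallel-layer, multi-layer, or dynamic-layer method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_outputs, layer_scores):
    """Fuse per-layer encoder outputs into one representation.

    layer_outputs: list of L layers; each layer is a list of T token
        vectors, and each vector is a list of D floats.
    layer_scores: list of L unnormalized scores (assumed here to stand
        in for learned parameters; this is an illustrative assumption).
    Returns T vectors of size D: the softmax-weighted sum across layers.
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    num_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    fused = [[0.0] * dim for _ in range(num_tokens)]
    for weight, layer in zip(weights, layer_outputs):
        for t, vector in enumerate(layer):
            for d, value in enumerate(vector):
                fused[t][d] += weight * value
    return fused


# Toy input: 3 encoder layers, 2 tokens, 2 dimensions.
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]

# Equal scores weight every layer the same, giving a plain average.
averaged = aggregate_layers(layers, [0.0, 0.0, 0.0])

# A dominant score on the last layer recovers top-layer-only encoding,
# which is the behaviour the abstract attributes to existing models.
top_only = aggregate_layers(layers, [-1000.0, -1000.0, 0.0])
```

In this sketch the top-layer-only baseline is just one particular setting of the weights, which is why letting the model draw on the other layers cannot discard information the baseline already uses.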


Updated: 2021-05-15
