Mixed-Level Neural Machine Translation
Computational Intelligence and Neuroscience (IF 3.120), Pub Date: 2020-11-29, DOI: 10.1155/2020/8859452
Thien Nguyen 1, Huu Nguyen 2, Phuoc Tran 3

While building the first Russian-Vietnamese neural machine translation system, we faced the problem of choosing the translation unit system on which the source and target embeddings are based. Existing homogeneous translation unit systems, which use the same translation unit on the source and target sides, do not suit this language pair well. To solve this problem, we propose a novel heterogeneous translation unit system that takes into account the linguistic characteristics of Russian, a synthetic language, and Vietnamese, an analytic language. Specifically, we lower the embedding level on the source side by splitting tokens into subtokens, and we raise the embedding level on the target side by merging neighboring tokens into supertokens. The experimental results show that the proposed heterogeneous system improves on the best existing homogeneous Russian-Vietnamese translation system by 1.17 BLEU. Our approach could be applied to building translation bots for other language pairs with differing linguistic characteristics.
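
The two preprocessing directions described in the abstract can be illustrated with a toy sketch. The abstract does not say which segmentation or merging algorithms the authors used, so everything below is an assumption made for illustration: the function names, the `subword_vocab` and `phrase_list` inputs, the greedy longest-match strategy, the `@@` continuation marker (a convention borrowed from common BPE tooling), and the `_` joiner are all hypothetical, as are the example words.

```python
def split_into_subtokens(token, subword_vocab):
    """Greedy longest-match split of one source token into subtokens.

    `subword_vocab` is a hypothetical set of known subword strings.
    Characters not covered by the vocabulary become single-character pieces.
    """
    pieces = []
    start = 0
    while start < len(token):
        end = len(token)
        # Shrink the candidate until it is in the vocabulary or one character long.
        while end > start + 1 and token[start:end] not in subword_vocab:
            end -= 1
        pieces.append(token[start:end])
        start = end
    # Mark every non-final piece with "@@" so the split can be undone later.
    return [p + "@@" for p in pieces[:-1]] + pieces[-1:]


def merge_into_supertokens(tokens, phrase_list, max_len=3):
    """Greedy left-to-right merge of neighboring target tokens into supertokens.

    `phrase_list` is a hypothetical set of token tuples that should be
    treated as one unit; merged tokens are joined with "_".
    """
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest window first; a single token always matches.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if n == 1 or candidate in phrase_list:
                merged.append("_".join(candidate))
                i += n
                break
    return merged


# Source side (Russian, synthetic): one word form becomes stem + ending.
source = split_into_subtokens("студентами", {"студент", "ами"})

# Target side (Vietnamese, analytic): two syllable tokens become one unit.
target = merge_into_supertokens(
    "các học sinh đọc sách".split(), {("học", "sinh")}
)

print(source)
print(target)
```

In this sketch the source side ends up with more, shorter units and the target side with fewer, longer ones, which is the heterogeneous arrangement the abstract describes. A real system would learn the subword vocabulary and the phrase list from data rather than hard-code them.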

Updated: 2020-12-01