Statistical versus neural machine translation – a case study for a medium size domain-specific bilingual corpus, Poznan Studies in Contemporary Linguistics



Poznan Studies in Contemporary Linguistics (IF 0.5), Pub Date: 2019-06-26, DOI: 10.1515/psicl-2019-0018
Krzysztof Jassem, Tomasz Dwojak

Abstract: Neural Machine Translation (NMT) has recently achieved promising results for a number of translation pairs. Although the method requires larger volumes of data and more computational power than Statistical Machine Translation (SMT), it is expected to become dominant in the near future. In this paper we evaluate SMT and NMT models trained on a domain-specific English-Polish corpus of moderate size (1,200,000 segments). The experiment shows that both solutions significantly outperform a general-domain online translator. The SMT model achieves a slightly better BLEU score than the NMT model. On the other hand, decoding is noticeably faster in NMT. Human evaluation carried out on a sizeable sample of translations (2,000 pairs) reveals the superiority of the NMT approach, particularly in terms of output fluency.
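
For readers unfamiliar with the BLEU metric mentioned in the abstract, the following is a minimal sketch of how a corpus-level BLEU score is typically computed: the geometric mean of clipped n-gram precisions, multiplied by a brevity penalty. It is an illustration only. The abstract does not say which BLEU implementation, tokenization, or smoothing the authors used, so the function names (`ngrams`, `corpus_bleu`), the single-reference setup, and the lack of smoothing are all assumptions made here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per segment, no smoothing (assumed setup).

    hypotheses, references: lists of token lists, aligned by index.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize output that is not longer than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Toy usage with made-up tokens (not data from the paper).
toy_hyp = [["a", "b", "c", "d"]]
toy_ref = [["a", "b", "c", "d", "e"]]
toy_score = corpus_bleu(toy_hyp, toy_ref)
```

In the toy call every n-gram of the hypothesis appears in the reference, so the score is determined by the brevity penalty alone. Published comparisons normally rely on a standard tool with fixed tokenization rather than a hand-written function like this one, since tokenization choices change the resulting score.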


Updated: 2019-06-26
