Statistical versus neural machine translation – a case study for a medium size domain-specific bilingual corpus

Krzysztof Jassem; Tomasz Dwojak

doi:10.1515/psicl-2019-0018

Published by De Gruyter Mouton August 17, 2019

From the journal Poznan Studies in Contemporary Linguistics

Abstract

Neural Machine Translation (NMT) has recently achieved promising results for a number of translation pairs. Although the method requires larger volumes of data and more computational power than Statistical Machine Translation (SMT), it is expected to become dominant in the near future. In this paper, we evaluate SMT and NMT models trained on a domain-specific English-Polish corpus of moderate size (1,200,000 segments). The experiment shows that both solutions significantly outperform a general-domain online translator. The SMT model achieves a slightly better BLEU score than the NMT model. On the other hand, decoding is noticeably faster in NMT. Human evaluation carried out on a sizeable sample of translations (2,000 pairs) reveals the superiority of the NMT approach, particularly with respect to output fluency.

Keywords: Statistical machine translation; neural machine translation

Krzysztof Jassem, Adam Mickiewicz University, Umultowska 87, 61-614 Poznań, Poland

Published Online: 2019-08-17

Published in Print: 2019-06-26
