A massively parallel corpus: the Bible in 100 languages.
Language Resources and Evaluation (IF 2.7), Pub Date: 2014-11-19, DOI: 10.1007/s10579-014-9287-y
Christos Christodouloupoulos, Mark Steedman

We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.

Updated: 2014-11-19