Probing Multilingual Language Models for Discourse
arXiv - CS - Computation and Language Pub Date : 2021-06-09 , DOI: arxiv-2106.04832
Murathan Kurfalı, Robert Östling

Pre-trained multilingual language models have become an important building block in multilingual natural language processing. In the present paper, we investigate a range of such models to find out how well they transfer discourse-level knowledge across languages. This is done with a systematic evaluation on a broader set of discourse-level tasks than has previously been assembled. We find that the XLM-RoBERTa family of models consistently shows the best performance, by simultaneously being good monolingual models and degrading relatively little in a zero-shot setting. Our results also indicate that model distillation may hurt the ability of cross-lingual transfer of sentence representations, while language dissimilarity at most has a modest effect. We hope that our test suite, covering 5 tasks with a total of 22 languages in 10 distinct families, will serve as a useful evaluation platform for multilingual performance at and beyond the sentence level.
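To make the zero-shot probing setup concrete, the following is a minimal sketch (not the authors' exact code) of how such an evaluation is typically run: sentence representations are extracted from a frozen multilingual encoder, a lightweight probe is trained on English data, and it is then applied unchanged to another language. The model name, mean-pooling strategy, discourse labels, and logistic-regression probe here are all illustrative assumptions.

# Hypothetical zero-shot cross-lingual probing sketch.
# Assumptions: HuggingFace transformers, mean pooling, a logistic-regression probe.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "xlm-roberta-base"  # one member of the XLM-RoBERTa family
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()  # the encoder stays frozen; only the probe is trained

@torch.no_grad()
def embed(sentences):
    """Mean-pool the final hidden states into fixed-size sentence vectors."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)     # (batch, seq_len, 1)
    summed = (hidden * mask).sum(dim=1)
    return (summed / mask.sum(dim=1)).numpy()

# Toy data: train the probe on English, evaluate zero-shot on German.
train_sents = ["The plan failed. Therefore, we changed course.",
               "The plan failed. Meanwhile, sales kept growing."]
train_labels = ["result", "contrast"]                # hypothetical discourse relations
test_sents_de = ["Der Plan scheiterte. Deshalb änderten wir den Kurs."]

probe = LogisticRegression(max_iter=1000).fit(embed(train_sents), train_labels)
print(probe.predict(embed(test_sents_de)))           # zero-shot prediction

The degradation reported in the abstract is the gap between this zero-shot score and the score of a probe trained and tested in the same language.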

Updated: 2021-06-10