Testing pre-trained Transformer models for Lithuanian news clustering
arXiv - CS - Information Retrieval. Pub Date: 2020-04-03, DOI: arxiv-2004.03461
Lukas Stankevičius and Mantas Lukoševičius

A recent introduction of the Transformer deep learning architecture led to breakthroughs in various natural language processing tasks. However, non-English languages could not leverage these new opportunities with models pre-trained on English text alone. This changed with research focusing on multilingual models, where less-spoken languages are the main beneficiaries. We compare pre-trained multilingual BERT, XLM-R, and older learned text representation methods as encodings for the task of Lithuanian news clustering. Our results indicate that publicly available pre-trained multilingual Transformer models can be fine-tuned to surpass word vectors but still score much lower than specially trained doc2vec embeddings.
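The sketch below (not the authors' code) illustrates the kind of pipeline the abstract describes: encoding Lithuanian news texts with a pre-trained multilingual Transformer and clustering the resulting document vectors. The model name, mean pooling, and cluster count are illustrative assumptions, not choices confirmed by the paper.

```python
# Minimal sketch: multilingual Transformer embeddings + k-means clustering.
# Model, pooling, and n_clusters are assumptions for illustration only.
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans
import torch

texts = [
    "Lietuvos rinktinė laimėjo draugiškas rungtynes.",  # sports-like example
    "Seimas svarsto naują mokesčių įstatymą.",          # politics-like example
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**enc)
    # Mean-pool token embeddings (ignoring padding) into one vector per document.
    mask = enc["attention_mask"].unsqueeze(-1).float()
    doc_vectors = (out.last_hidden_state * mask).sum(1) / mask.sum(1)

# Cluster the document embeddings; the paper evaluates such clusterings
# of Lithuanian news against doc2vec and word-vector baselines.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(doc_vectors.numpy())
print(labels)
```

Swapping `bert-base-multilingual-cased` for an XLM-R checkpoint, or replacing the encoder with doc2vec vectors, gives the comparison points mentioned in the abstract.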

Updated: 2020-10-20