Navigation-based candidate expansion and pretrained language models for citation recommendation
Scientometrics (IF 3.9), Pub Date: 2020-10-10, DOI: 10.1007/s11192-020-03718-9
Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, Jimmy Lin

Citation recommendation systems for the scientific literature, to help authors find papers that should be cited, have the potential to speed up discoveries and uncover new routes for scientific exploration. We treat this task as a ranking problem, which we tackle with a two-stage approach: candidate generation followed by re-ranking. Within this framework, we adapt to the scientific domain a proven combination based on "bag of words" retrieval followed by re-scoring with a BERT model. We experimentally show the effects of domain adaptation, both in terms of pretraining on in-domain data and exploiting in-domain vocabulary. In addition, we introduce a novel navigation-based document expansion strategy to enrich the candidate documents processed by our neural models. On three different collections from different scientific disciplines, we achieve the best-reported results in the citation recommendation task.
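The two-stage approach described in the abstract (candidate generation by bag-of-words retrieval, then re-ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the BM25 variant and its parameters `k1` and `b` are common defaults chosen here as assumptions, all function names are hypothetical, and `overlap_scorer` is a toy stand-in for the BERT re-scorer so that the example runs without any model download. The navigation-based expansion step is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization on lowercased text.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a simple BM25 variant."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def generate_candidates(query, docs, k):
    """Stage 1: return the indices of the top-k documents by BM25 score."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def rerank(query, docs, candidates, scorer):
    """Stage 2: reorder the candidates with a (typically more expensive) scorer."""
    return sorted(candidates, key=lambda i: scorer(query, docs[i]), reverse=True)


def overlap_scorer(query, doc):
    # Toy stand-in for a neural re-scorer: Jaccard overlap of token sets.
    q, d = set(tokenize(query)), set(tokenize(doc))
    union = q | d
    return len(q & d) / len(union) if union else 0.0


if __name__ == "__main__":
    papers = [
        "neural ranking models for citation recommendation",
        "protein folding with molecular dynamics",
        "citation recommendation with bag of words retrieval",
        "graph theory and combinatorics",
    ]
    context = "citation recommendation neural"
    top = generate_candidates(context, papers, k=2)
    print(rerank(context, papers, top, overlap_scorer))
```

In a real system the `scorer` argument would wrap a fine-tuned language model that scores each (query, candidate) pair. Keeping it a plain callable lets the cheap first stage limit how many pairs the expensive model has to process.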

Updated: 2020-10-10