A deep-learning based citation count prediction model with paper metadata semantic features
Scientometrics (IF 3.5), Pub Date: 2021-06-05, DOI: 10.1007/s11192-021-04033-7
Anqi Ma, Yu Liu, Xiujuan Xu, Tao Dong

Predicting the impact of academic papers can help scholars quickly identify high-quality papers in their field. How to develop an efficient predictive model for evaluating potential papers has attracted increasing attention in academia. Many studies have shown that early citations improve predictions of a paper's long-term impact. Besides early citations, some bibliometric and altmetric features have also been explored for predicting the impact of academic papers. Furthermore, paper metadata text such as the title, abstract, and keywords contains valuable information that affects a paper's citation count. However, existing studies ignore the semantic information contained in the metadata text. In this paper, we propose a novel citation prediction model based on paper metadata text to predict the long-term citation count; the core of our model is obtaining semantic information from the metadata text. We use deep learning techniques to encode the metadata text and then extract high-level semantic features for learning the citation prediction task. We also integrate early citations to improve the model's prediction performance. We show that our proposed model outperforms state-of-the-art models in predicting the long-term citation count of papers, and that metadata semantic features are effective for improving the accuracy of citation prediction models.




Updated: 2021-06-05