Combining deep neural network and bibliometric indicator for emerging research topic prediction
Information Processing & Management (IF 8.6), Pub Date: 2021-04-30, DOI: 10.1016/j.ipm.2021.102611
Zhentao Liang, Jin Mao, Kun Lu, Zhichao Ba, Gang Li

Predicting emerging research topics is important to researchers and policymakers. In this study, we propose a two-step solution to the problem of emerging topic prediction. The first step forecasts, in a time-series manner, the future popularity score of candidate topics; the popularity score is a novel indicator reflecting impact and growth. The second step selects novel topics from the candidates predicted to be popular in the first step. Terms with domain characteristics are used as candidate topics. Deep neural networks, specifically LSTM and NNAR, are applied with nine features of topics to predict the popularity score. We evaluated the models and five baselines on two datasets from two perspectives, i.e., the ability to (1) predict the correct indicator value and (2) reconstruct the optimal ranking order. Two types of training strategies were compared: a global strategy that trains one model on all topics and two local strategies that train separate models on different groups of topics. Our results show that LSTM and NNAR outperform other models in predicting the value of the popularity score as measured by MAE and RMSE, while LightGBM is a competitive baseline in ranking the topics in terms of NDCG@20. The performance difference between global and local strategies is not significant. Emerging topics predicted by our approach are compared with those predicted by other methods. A qualitative assessment of the nominated emerging topics suggests that topics nominated by machine learning methods are more alike than those nominated by the rule-based model. A preliminary literature analysis indicates that some of the nominated topics are important. This study exploits the strengths of both machine learning and bibliometric indicator approaches for emerging topic prediction. Deep neural networks are applied where an objective optimization target can be defined and measured. Bibliometric indicators offer an efficient way to select novel topics from candidates. The hybrid approach shows promise in considering various characteristics of emerging topics when making predictions.
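
The two evaluation perspectives named in the abstract (value accuracy via MAE and RMSE, ranking quality via NDCG@k) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names are hypothetical, the popularity scores are assumed to be non-negative, and the true score is used directly as a linear gain in NDCG, which may differ from the gain function used in the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k items (linear gain: an assumption)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def ndcg_at_k(y_true, y_pred, k=20):
    """NDCG@k: rank topics by predicted score, then score that order by true values."""
    order = sorted(range(len(y_pred)), key=lambda i: y_pred[i], reverse=True)
    ranked_gains = [y_true[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(y_true, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

if __name__ == "__main__":
    # Made-up scores for four hypothetical topics, for illustration only.
    true_scores = [3.0, 1.0, 2.0, 0.5]
    pred_scores = [2.5, 1.5, 2.0, 0.0]
    print("MAE:", mae(true_scores, pred_scores))
    print("RMSE:", rmse(true_scores, pred_scores))
    print("NDCG@20:", ndcg_at_k(true_scores, pred_scores, k=20))
```

The sketch also shows why the two perspectives can disagree: in the example, the predicted values are off for three of the four topics, yet the predicted ordering matches the true ordering, so the ranking metric is perfect while the error metrics are not.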




Updated: 2021-04-30