Maximum Value Matters: Finding Hot Topics in Scholarly Fields
IEEE Transactions on Network Science and Engineering (IF 6.6), Pub Date: 2020-10-01, DOI: 10.1109/tnse.2020.3022172
Guie Meng, Jiasheng Xu, Jinghao Zhao, Luoyi Fu, Huan Long, Xiaoying Gan, Xinbing Wang

Finding hot topics in scholarly fields helps researchers keep up with the latest concepts, trends, and inventions in their field of interest. Because complete large-scale scholarly data is rare, earlier studies approached this problem through manual topic extraction from a limited number of domains, focusing solely on a single feature such as coauthorship or citation relations. Given the compromised effectiveness of such predictions, in this paper we use a real scholarly dataset from Microsoft Academic Graph, which provides more than 12,000 topics in the field of Computer Science (CS), including 1,200 venues, 14.4 million authors, and 30 million papers with their citation relations from 1950 to the present. Aiming to find the topics that will trend in CS, we formalize a hot topic prediction problem in which, with joint consideration of both inter- and intra-topical influence, 17 different scientific features are extracted to describe topic status comprehensively. By leveraging all 17 features, we observe good accuracy in forecasting topic scale 5 and 10 years ahead, with R² values of 0.9893 and 0.9646, respectively. Interestingly, our prediction suggests that the maximum value matters in finding hot topics in scholarly fields, primarily in three respects: (1) the maximum value of each factor, such as authors' maximum h-index and largest citation number, provides three times as much information for prediction as the average value; (2) the mutual influence between the most correlated topics serves as the most telling factor in long-term topic trend prediction, indicating that topics currently exhibiting the maximum growth rates will drive their correlated topics to become hot in the future; (3) we predict the top 100 fastest-growing (maximum growth rate) topics that will potentially receive the most attention in CS over the next 5 years.
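
To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows how a topic's per-author and per-paper values might be summarised by both their maximum and their mean, and how a coefficient of determination (R squared) is computed for a forecast. The function names `topic_features` and `r_squared`, the feature names, and all numbers are hypothetical and invented purely for illustration.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    avg = mean(actual)
    ss_tot = sum((a - avg) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


def topic_features(h_indices, citation_counts):
    """Summarise one topic's raw values by max and by mean (hypothetical features)."""
    return {
        "max_h_index": max(h_indices),
        "mean_h_index": mean(h_indices),
        "max_citations": max(citation_counts),
        "mean_citations": mean(citation_counts),
    }


# Toy values, made up for illustration; they are not taken from the paper.
features = topic_features(h_indices=[3, 12, 45], citation_counts=[0, 7, 230])
score = r_squared(actual=[100, 200, 300], predicted=[110, 190, 300])
```

In this toy case the maximum h-index (45) is far above the mean (20), which hints at why a max-based feature can carry different information from an average when a few authors or papers dominate a topic.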

Updated: 2020-10-01