Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction
Journal of Informetrics (IF 3.7), Pub Date: 2020-06-30, DOI: 10.1016/j.joi.2020.101057
Nazim Choudhury, Fahim Faisal, Matloob Khushi

The literature-based discovery process identifies important but implicit relations among information embedded in published literature. Existing techniques from Information Retrieval (IR) and Natural Language Processing (NLP) attempt to identify hidden or unpublished connections between information concepts within published literature. However, these techniques overlook the prediction of future and emerging relations among scientific knowledge components, such as the author-selected keywords encapsulated within the literature. A Keyword Co-occurrence Network (KCN), built upon author-selected keywords, is considered a knowledge graph that captures both these knowledge components and the knowledge structure of a scientific domain by examining the relationships between knowledge entities. Using data from two multidisciplinary research domains other than the biomedical domain, and capitalizing on bibliometrics, the dynamicity of temporal KCNs, and a recurrent neural network, this study develops novel features that support the prediction of future literature-based discoveries: the emerging connections (co-appearances in the same article) among keywords. Temporal importance extracted from both bipartite and unipartite networks, communities defined by genealogical relations, and the relative importance of temporal citation counts were used in the feature construction process. Both node-level and edge-level features were input into a recurrent neural network to forecast the feature values and predict future relations between the scientific concepts/topics represented by the author-selected keywords. High performance rates, compared against both a contemporary heterogeneous network-based method and the preferential attachment process, suggest that these features complement both the prediction of future literature-based discoveries and emerging trend analysis.
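
To make the terms in the abstract concrete, the following is a minimal sketch of how a keyword co-occurrence network might be built from per-article keyword lists, and how a preferential attachment baseline could score keyword pairs that are not yet connected. It is not the authors' implementation: the function names, the toy keyword lists, and the choice of the degree product as the preferential attachment score are all assumptions made for illustration, and the paper's own temporal, genealogical, and citation-based features are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_kcn(articles):
    """Count how often each pair of keywords appears in the same article.

    `articles` is a list of keyword lists (one list per article).
    Returns {frozenset({kw_a, kw_b}): co-occurrence count}.
    """
    edges = defaultdict(int)
    for keywords in articles:
        # De-duplicate keywords within an article before pairing them.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[frozenset((a, b))] += 1
    return dict(edges)


def degrees(edges):
    """Number of distinct co-occurring keywords for each keyword."""
    deg = defaultdict(int)
    for pair in edges:
        for keyword in pair:
            deg[keyword] += 1
    return dict(deg)


def preferential_attachment_scores(edges):
    """Score every unconnected keyword pair by the product of its degrees.

    Assumption: the degree product is used as the preferential attachment
    score; the paper may define its baseline differently.
    """
    deg = degrees(edges)
    scores = {}
    for a, b in combinations(sorted(deg), 2):
        if frozenset((a, b)) not in edges:
            scores[(a, b)] = deg[a] * deg[b]
    return scores


# Hypothetical toy data: author-selected keywords of three past articles.
past_articles = [
    ["link prediction", "knowledge graph", "bibliometrics"],
    ["knowledge graph", "neural network"],
    ["bibliometrics", "citation analysis"],
]

kcn = build_kcn(past_articles)
scores = preferential_attachment_scores(kcn)

# Candidate future co-occurrences, highest score first.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)
```

In this sketch a "future literature-based discovery" corresponds to a keyword pair that is absent from `kcn` but appears together in a later article; the scores only rank such candidate pairs and make no use of the temporal information the study relies on.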




Updated: 2020-06-30