Evaluating the state-of-the-art in mapping research spaces: a Brazilian case study
arXiv - CS - Digital Libraries Pub Date : 2021-04-07 , DOI: arxiv-2104.03338
Francisco Galuppo Azevedo, Fabricio Murai

Scientific knowledge should not be seen as a set of isolated fields, but as a highly connected network. Understanding how research areas are connected is of paramount importance for adequately allocating funding and human resources (e.g., assembling teams to tackle multidisciplinary problems). The relationship between disciplines can be drawn from data on the trajectory of individual scientists, as researchers often make contributions in a small set of interrelated areas. Two recent works propose methods for creating research maps from scientists' publication records: one uses a frequentist approach to create a transition probability matrix, and the other learns embeddings (vector representations). Surprisingly, these models were evaluated on different datasets and have never been compared in the literature. In this work, we compare both models in a systematic way, using a large dataset of publication records from Brazilian researchers. We evaluate each model's ability to predict whether a given entity (scientist, institution or region) will enter a new field, measured by the area under the ROC curve. Moreover, we analyze how sensitive each method is to the number of publications and the number of fields associated with an entity. Last, we conduct a case study to showcase how these models can be used to characterize science dynamics in the context of Brazil.
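
To make the frequentist idea and the evaluation metric concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the method of either work the abstract refers to: the co-occurrence estimate of the transition probability, the max-based scoring rule, and every name (`transition_matrix`, `entry_score`, `auc`) and all toy data are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def transition_matrix(records):
    """Estimate P[(a, b)]: the share of entities active in field a that are
    also active in field b. ASSUMPTION: a simple co-occurrence estimate,
    chosen for illustration; the abstract does not specify the estimator."""
    field_count = defaultdict(int)
    pair_count = defaultdict(int)
    for fields in records.values():
        for f in fields:
            field_count[f] += 1
        for a, b in permutations(fields, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / field_count[a] for (a, b), n in pair_count.items()}


def entry_score(P, current_fields, candidate):
    """Score how likely an entity is to enter `candidate`, taken here as the
    highest transition probability from any field it already works in
    (a hypothetical scoring rule)."""
    return max((P.get((f, candidate), 0.0) for f in current_fields), default=0.0)


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    positive example outscores a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy publication records: entity -> set of fields (made-up data).
records = {
    "a": {"physics", "math"},
    "b": {"physics", "math"},
    "c": {"physics", "biology"},
    "d": {"biology", "chemistry"},
}
P = transition_matrix(records)
```

In this sketch, `entry_score(P, {"math"}, "physics")` would rank physics as a likely next field for a math-only entity, and `auc` would then summarize how well such scores separate the fields an entity actually entered from those it did not.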

Updated: 2021-04-09