A dataset of mentorship in science with semantic and demographic estimations
arXiv - CS - Computers and Society Pub Date : 2021-06-11 , DOI: arxiv-2106.06487
Qing Ke, Lizhen Liang, Ying Ding, Stephen V. David, Daniel E. Acuna

Mentorship in science is crucial for topic choice, career decisions, and the success of mentees and mentors. Typically, researchers who study mentorship use article co-authorship and doctoral dissertation datasets. However, available datasets of this type focus on narrow selections of fields and miss early-career and non-publication-related interactions. Here, we describe MENTORSHIP, a crowdsourced dataset of 743,176 mentorship relationships among 738,989 scientists across 112 fields that avoids these shortcomings. We enrich the scientists' profiles with publication data from the Microsoft Academic Graph and "semantic" representations of research using deep learning content analysis. Because gender and race have become critical dimensions when analyzing mentorship and disparities in science, we also provide estimations of these factors. We perform extensive validations of the profile-publication matching, semantic content, and demographic inferences. We anticipate this dataset will spur the study of mentorship in science and deepen our understanding of its role in scientists' career outcomes.
Updated: 2021-06-14