A Novel Method for Resolving and Completing Authors’ Country Affiliation Data in Bibliographic Records
Journal of Data and Information Science, Pub Date: 2020-07-09, DOI: 10.2478/jdis-2020-0020
Ba Xuan Nguyen, Jesse David Dinneen, Markus Luczak-Roesch

Abstract

Purpose: Our work seeks to overcome data quality issues related to incomplete author affiliation data in bibliographic records in order to support accurate and reliable measurement of international research collaboration (IRC).

Design/methodology/approach: We propose, implement, and evaluate a method that leverages the Web-based knowledge graph Wikidata to resolve publication affiliation data to particular countries. The method is tested with general and domain-specific data sets.

Findings: Our evaluation covers the magnitude of improvement, accuracy, and consistency. Results suggest the method is beneficial, reliable, and consistent, and thus a viable and improved approach to measuring IRC.

Research limitations: Though our evaluation suggests the method works with both general and domain-specific bibliographic data sets, it may perform differently with data sets not tested here. Further limitations stem from the use of the R programming language and R libraries for country identification, as well as from imbalanced data coverage and quality in Wikidata, which may also change over time.

Practical implications: The new method helps to increase accuracy in IRC studies and provides a basis for further development into a general tool that enriches bibliographic data using the Wikidata knowledge graph.

Originality: This is the first attempt to enrich bibliographic data using a peer-produced, Web-based knowledge graph like Wikidata.
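The abstract does not give implementation details, and the authors report using R, so the following Python sketch is only an illustration of the general idea of resolving an affiliation string to a country through Wikidata. It assumes an exact label match on the organization name and uses the Wikidata property `P17` (country). The function names `build_country_query` and `parse_countries`, the exact-match strategy, and the sample response are assumptions made for this example, not the paper's method.

```python
# Illustrative sketch only: not the authors' implementation (they used R).
# Assumption: an affiliation can be matched to a Wikidata item by exact label,
# and that item carries a country statement (property P17).

WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_country_query(affiliation_label: str, language: str = "en") -> str:
    """Build a SPARQL query for the country (P17) of items with this exact label."""
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = affiliation_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?country ?countryLabel WHERE {\n"
        f'  ?org rdfs:label "{escaped}"@{language} ;\n'
        "       wdt:P17 ?country .\n"
        "  ?country rdfs:label ?countryLabel .\n"
        f'  FILTER(LANG(?countryLabel) = "{language}")\n'
        "}"
    )


def parse_countries(response_json: dict) -> list[str]:
    """Extract distinct country labels from a SPARQL JSON results document."""
    bindings = response_json.get("results", {}).get("bindings", [])
    labels: list[str] = []
    for row in bindings:
        label = row.get("countryLabel", {}).get("value")
        if label and label not in labels:
            labels.append(label)
    return labels


if __name__ == "__main__":
    print(build_country_query("Victoria University of Wellington"))

    # Hand-written, trimmed sample in the SPARQL JSON results format
    # (not a recorded response from the live service).
    sample_response = {
        "results": {
            "bindings": [
                {"countryLabel": {"type": "literal", "value": "New Zealand"}}
            ]
        }
    }
    print(parse_countries(sample_response))
```

To run this against live data, the query string would be sent to `WIKIDATA_SPARQL_ENDPOINT` as the `query` parameter with `format=json`. Exact label matching will miss abbreviated or misspelled affiliations, so a practical tool would likely need fuzzier matching, and, as the abstract notes, Wikidata's coverage is uneven and changes over time.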

Updated: 2020-07-09