Detecting referential inconsistencies in electronic CV datasets
Journal of the Brazilian Computer Society Pub Date : 2017-02-23 , DOI: 10.1186/s13173-017-0052-0
Ivison C. Rubim , Vanessa Braganholo

One way to measure the scientific progress of a country is to evaluate the curricula vitae (CVs) of its researchers, and Brazil is no exception. The Lattes Platform is an information system whose primary objective is to provide a single repository for the CVs of Brazilian researchers. It is increasingly recognized as the main source of information on the Brazilian community of researchers, students, managers, and other actors in the national system of science, technology, and innovation. However, the integrity of this important tool for gauging national bibliographic production may be compromised by ambiguities or referential inconsistencies in coauthored citations, that is, cases where coauthors record the same publication differently in their respective CVs. A first step towards solving this problem is to identify such inconsistencies. To that end, we propose a heuristic-based approach that uses similarity search to match papers across the CVs of coauthors. We then use this technique to analyze over 2000 CVs of researchers from a given institution, retrieved from the Lattes Platform. The results indicate that 18.98% of the analyzed publications present referential inconsistencies, a significant share for a dataset that is supposed to be correct and trustworthy.
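
To make the idea of matching coauthors' entries by similarity concrete, here is a minimal Python sketch. It is not the authors' heuristic, which the abstract does not specify: the record layout (`title`, `year`), the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two titles after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_inconsistencies(cv_a, cv_b, threshold=0.85):
    """Return pairs that look like the same paper but are recorded differently.

    cv_a and cv_b are lists of dicts with hypothetical keys "title" and "year",
    one list per coauthor. The threshold is an assumed value, not one taken
    from the paper.
    """
    flagged = []
    for pub_a in cv_a:
        for pub_b in cv_b:
            score = similarity(pub_a["title"], pub_b["title"])
            if score < threshold:
                continue  # probably different papers
            same_text = pub_a["title"] == pub_b["title"]
            same_year = pub_a["year"] == pub_b["year"]
            if not (same_text and same_year):
                flagged.append((pub_a, pub_b, round(score, 2)))
    return flagged


if __name__ == "__main__":
    cv_a = [{"title": "Detecting referential inconsistencies in electronic CV datasets", "year": 2017}]
    cv_b = [
        {"title": "Detecting Referential Inconsistencies in Electronic CV Datasets.", "year": 2016},
        {"title": "A survey of graph databases", "year": 2015},
    ]
    for pub_a, pub_b, score in find_inconsistencies(cv_a, cv_b):
        print(score, pub_a["year"], pub_b["year"])
```

The pairwise loop is quadratic in the number of publications, which is acceptable for two CVs but would likely need blocking or indexing (for example, grouping by year or by title tokens) to scale to thousands of CVs.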

Updated: 2017-02-23