Data credit distribution: A new method to estimate databases impact, Journal of Informetrics


Journal of Informetrics (IF 3.7), Pub Date: 2020-08-23, DOI: 10.1016/j.joi.2020.101080
Dennis Dosso, Gianmaria Silvello

It is widely accepted that data is fundamental to research and should therefore be cited in the same way as textual scientific publications. However, how to cite data, and how to handle and count the credit generated by such citations, remain open research questions.

Data credit is a new measure of value built on top of data citation; it lets us annotate data with a value representing its importance. Together with traditional citations, data credit can help recognize the value of data and its creators in a world that is ever more dependent on data.

In this paper we define data credit distribution (DCD) as the process by which credit generated by citations is assigned to the individual elements of a database. We focus on a scenario where a paper cites data obtained by issuing a query against a database. The citation generates credit, which is then divided among the database entities responsible for generating the query output. A key aspect of our work is to credit not only the explicitly cited entities but also those that contribute to their existence yet are not accounted for in the query output.

We propose a data credit distribution strategy (CDS) based on data provenance and implement a system that uses the information provided by data citations to distribute the credit in a relational database accordingly.
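The abstract does not state the distribution formula, so the following is only a minimal sketch of what a provenance-based split could look like. It assumes that the lineage of each output tuple (the set of input tuples that produced it) is already known, and that credit is divided equally, first among output tuples and then among the input tuples in each lineage. The function name `distribute_credit`, the equal-split rule, and the toy table names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def distribute_credit(credit, lineage):
    """Split `credit` among the input tuples found in a query's lineage.

    `lineage` maps each output tuple id to the set of input tuple ids,
    written as (table, key), that contributed to it.
    Assumption (not from the paper): credit is split equally among the
    output tuples, then equally among each output tuple's sources.
    """
    shares = defaultdict(Fraction)
    if not lineage:
        return shares
    per_output = Fraction(credit) / len(lineage)
    for output_id, sources in lineage.items():
        if not sources:
            raise ValueError(f"output tuple {output_id!r} has an empty lineage")
        per_source = per_output / len(sources)
        for source in sources:
            shares[source] += per_source
    return shares


# Toy lineage: two output rows that share one input tuple (hypothetical names).
toy_lineage = {
    "out1": {("ligand", 1), ("interaction", 10)},
    "out2": {("ligand", 2), ("interaction", 10)},
}
toy_shares = distribute_credit(1, toy_lineage)
for source, share in sorted(toy_shares.items()):
    print(source, share)
```

In this toy case the shared `interaction` tuple accumulates credit from both output rows, which illustrates how tuples that never appear in the query output can still be rewarded. Exact fractions are used so that the distributed shares always sum to the original credit.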

As a use case and for evaluation purposes, we adopt the IUPHAR/BPS Guide to Pharmacology (GtoPdb), a curated relational database. We show how credit can be used to highlight areas of the database that are frequently used. We also show how credit rewards data and authors based on their research impact, and not merely on the number of citations. This can lead to the design of new bibliometrics for data citations.
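To illustrate the difference between counting citations and accumulating credit, here is a small sketch on invented data. It assumes each citation carries its own credit amount and that this amount is split equally among the tables it touches; the function `summarize`, the credit values, and the table names are hypothetical and do not come from GtoPdb or from the paper's evaluation.

```python
from collections import Counter, defaultdict


def summarize(citations):
    """Return (citation_count, credit_total) per table.

    Each citation is a pair (credit, tables_used).
    Assumption (not from the paper): a citation's credit is split
    equally among the distinct tables it uses.
    """
    counts = Counter()
    credit = defaultdict(float)
    for amount, tables in citations:
        distinct = set(tables)
        for table in distinct:
            counts[table] += 1
            credit[table] += amount / len(distinct)
    return counts, dict(credit)


# Invented citations: (credit carried by the citation, tables it used).
toy_citations = [
    (1.0, ["ligand"]),
    (1.0, ["ligand", "target"]),
    (4.0, ["target"]),
]
toy_counts, toy_credit = summarize(toy_citations)
for table in sorted(toy_counts):
    print(table, toy_counts[table], toy_credit[table])
```

Both toy tables are cited twice, yet `target` accumulates more credit because one of its citations carries a larger amount. This is the kind of distinction a raw citation count cannot express.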


Updated: 2020-08-23
