Unbiased evaluation of ranking metrics reveals consistent performance in science and technology citation data
arXiv - CS - Information Retrieval Pub Date : 2020-01-15 , DOI: arxiv-2001.05414
Shuqi Xu, Manuel Sebastian Mariani, Linyuan Lü, Matúš Medo

Despite the increasing use of citation-based metrics for research evaluation purposes, we do not yet know which metrics best deliver on their promise to gauge the significance of a scientific paper or a patent. We assess 17 network-based metrics by their ability to identify milestone papers and patents in three large citation datasets. We find that traditional information-retrieval evaluation metrics are strongly affected by the interplay between the age distribution of the milestone items and the age biases of the evaluated metrics. Outcomes of these evaluation metrics are therefore not representative of the metrics' ranking ability. We argue in favor of a modified evaluation procedure that explicitly penalizes biased metrics and allows us to reveal performance patterns that are consistent across the datasets. PageRank and LeaderRank turn out to be the best-performing ranking metrics when their age bias is suppressed by a simple transformation of the scores that they produce, whereas other popular metrics, including citation count, HITS and Collective Influence, produce significantly worse ranking results.
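The abstract's key idea is that raw network scores such as PageRank are biased toward older papers, and that a simple transformation of the scores can suppress this age bias. A minimal sketch of that kind of correction is shown below: PageRank is computed by plain power iteration on a toy citation network, then each paper's score is rescaled (z-scored) against papers of similar age. The toy data, the same-year cohort choice, and the z-score form of the rescaling are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch (assumptions: toy citation data, per-year z-score
# rescaling as one simple age-bias-suppressing transformation).
from statistics import mean, pstdev

# Citation edges (citing, cited); older papers accumulate more citations.
year = {"p1": 2000, "p2": 2000, "p3": 2010, "p4": 2010, "p5": 2010, "p6": 2010}
cites = [("p3", "p1"), ("p4", "p1"), ("p5", "p1"), ("p5", "p2"),
         ("p6", "p2"), ("p6", "p3")]

def pagerank(nodes, edges, d=0.85, iters=100):
    """Plain power-iteration PageRank on a directed citation graph."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    out_deg = {v: sum(1 for s, _ in edges if s == v) for v in nodes}
    for _ in range(iters):
        # Dangling nodes (no outgoing citations) spread mass uniformly.
        dangling = sum(score[v] for v in nodes if out_deg[v] == 0)
        new = {}
        for v in nodes:
            incoming = sum(score[s] / out_deg[s] for s, t in edges if t == v)
            new[v] = (1 - d) / n + d * (incoming + dangling / n)
        score = new
    return score

def age_rescale(score, year):
    """Z-score each paper's score within its publication-year cohort,
    so papers are compared against similarly aged papers."""
    rescaled = {}
    for y in set(year.values()):
        cohort = [v for v in score if year[v] == y]
        vals = [score[v] for v in cohort]
        mu, sigma = mean(vals), pstdev(vals)
        for v in cohort:
            rescaled[v] = (score[v] - mu) / sigma if sigma > 0 else 0.0
    return rescaled

pr = pagerank(list(year), cites)   # raw scores favor the 2000 papers
rpr = age_rescale(pr, year)        # rescaled scores are age-comparable
```

In the raw ranking the heavily cited year-2000 papers dominate; after rescaling, a strong 2010 paper can outrank a weak 2000 one, which is the effect the abstract's bias-suppressing transformation is meant to achieve.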

Updated: 2020-07-10