Using Replicates in Information Retrieval Evaluation
ACM Transactions on Information Systems ( IF 5.4 ) Pub Date : 2017-08-29 , DOI: 10.1145/3086701
Ellen M. Voorhees, Daniel Samarov, Ian Soboroff

This article explores a method for more accurately estimating the main effect of the system in a typical test-collection-based evaluation of information retrieval systems, thus increasing the sensitivity of system comparisons. Randomly partitioning the test document collection allows multiple tests of a given system-and-topic combination (replicates). Bootstrap ANOVA can use these replicates to extract system-topic interactions—something not possible without replicates—yielding a more precise value for the system effect and a narrower confidence interval around that value. Experiments using multiple TREC collections demonstrate that removing the topic-system interactions substantially narrows the confidence intervals around the system effect and increases the number of significant pairwise differences found. Further, the method is robust against small changes in the number of partitions used, against variability in the documents that constitute the partitions, and against the choice of measure used to quantify system effectiveness.
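The core device in the abstract is the random partitioning step: splitting the document collection into disjoint partitions so that each system-topic pair can be scored once per partition, producing the replicates the bootstrap ANOVA needs. The sketch below illustrates only that partitioning-and-scoring idea; the function names, the use of precision at a fixed depth, and the per-partition scoring rule are illustrative assumptions, not the paper's exact procedure or measure.

```python
import random


def partition_collection(doc_ids, k, seed=0):
    """Randomly split the document collection into k disjoint partitions.

    Returns a list of k sets of document IDs that together cover the
    whole collection. (Illustrative sketch, not the paper's code.)
    """
    rng = random.Random(seed)
    docs = list(doc_ids)
    rng.shuffle(docs)
    # Deal the shuffled documents round-robin into k nearly equal parts.
    return [set(docs[i::k]) for i in range(k)]


def replicate_scores(ranked_run, relevant, partitions, depth=10):
    """Score one system on one topic once per partition (one replicate each).

    Here each replicate is precision@depth computed over the run restricted
    to that partition's documents -- an assumed stand-in for whichever
    effectiveness measure an evaluation actually uses.
    """
    scores = []
    for part in partitions:
        # Keep the system's ranking order, but only documents in this partition.
        ranked_in_part = [d for d in ranked_run if d in part][:depth]
        rel_retrieved = sum(1 for d in ranked_in_part if d in relevant)
        scores.append(rel_retrieved / depth)
    return scores


if __name__ == "__main__":
    # Toy example: 100 documents, 4 partitions, relevant docs ranked first.
    parts = partition_collection(range(100), k=4, seed=1)
    run = list(range(100))          # a hypothetical system's ranking
    rel = set(range(10))            # hypothetical relevant documents
    print(replicate_scores(run, rel, parts))  # 4 replicate scores per topic
```

With k partitions, each system-topic cell of the evaluation design holds k observations instead of one, which is what lets a two-way ANOVA separate the system-topic interaction from residual error.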
