Empirical methodology for crowdsourcing ground truth
Semantic Web (IF 3) · Pub Date: 2020-12-07 · DOI: 10.3233/sw-200415
Anca Dumitrache, Oana Inel, Benjamin Timmermans, Carlos Ortiz, Robert-Jan Sips, Lora Aroyo, Chris Welty
The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity as an attempt to solve the issues of data volume and lack of annotators. Typically, these practices use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, the data is ambiguous and admits a multitude of perspectives on the information examples. We present an empirically derived methodology for efficiently gathering ground truth data across a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach is the use of CrowdTruth metrics that capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring a high-quality ground truth. We demonstrate this by comparing the quality of data aggregated with CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction, and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, contrary to the usual practice of employing a small number of annotators.
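The contrast between majority vote and disagreement-aware aggregation can be illustrated with a toy sketch. This is a simplified illustration only, not the actual CrowdTruth implementation: the real CrowdTruth metrics also weight scores by worker quality and media unit quality, which are computed iteratively. The `judgments` data and function names below are hypothetical.

```python
from collections import Counter

# Toy crowdsourcing data: each media unit maps to the annotations
# chosen by individual workers (hypothetical example data).
judgments = {
    "unit1": ["event", "event", "event", "no_event", "event"],
    "unit2": ["event", "no_event", "no_event", "event", "no_event"],
}

def majority_vote(workers):
    """Baseline aggregation: keep only the single most frequent
    annotation, discarding all information about disagreement."""
    return Counter(workers).most_common(1)[0][0]

def annotation_scores(workers):
    """Simplified disagreement-aware score: the fraction of workers
    selecting each annotation. This is a stand-in for the CrowdTruth
    unit-annotation score, which additionally weights each worker's
    vote by an iteratively computed worker-quality score."""
    counts = Counter(workers)
    total = len(workers)
    return {ann: n / total for ann, n in counts.items()}

for unit, workers in judgments.items():
    print(unit, majority_vote(workers), annotation_scores(workers))
```

Majority vote collapses `unit2` to a single label even though the workers are nearly split, whereas the continuous scores preserve that ambiguity, which is exactly the signal the paper argues is essential for event-detection-style tasks.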

Updated: 2020-12-08