Can non-specialists provide high quality gold standard labels in challenging modalities?,arXiv - CS - Computer Vision and Pattern Recognition


Can non-specialists provide high quality gold standard labels in challenging modalities?
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-07-30, DOI: arxiv-2107.14682
Samuel Budd, Thomas Day, John Simpson, Karen Lloyd, Jacqueline Matthew, Emily Skelton, Reza Razavi, Bernhard Kainz

Probably yes. -- Supervised Deep Learning dominates performance scores for many computer vision tasks and defines the state-of-the-art. However, medical image analysis lags behind natural image applications. One of the many reasons is the lack of well annotated medical image data available to researchers. One of the first things researchers are told is that we require significant expertise to reliably and accurately interpret and label such data. We see significant inter- and intra-observer variability between expert annotations of medical images. Still, it is a widely held assumption that novice annotators are unable to provide useful annotations for use by clinical Deep Learning models. In this work we challenge this assumption and examine the implications of using a minimally trained novice labelling workforce to acquire annotations for a complex medical image dataset. We study the time and cost implications of using novice annotators, the raw performance of novice annotators compared to gold-standard expert annotators, and the downstream effects on a trained Deep Learning segmentation model's performance for detecting a specific congenital heart disease (hypoplastic left heart syndrome) in fetal ultrasound imaging.


Updated: 2021-08-02
