Does your dermatology classifier know what it doesn’t know? Detecting the long-tail of unseen conditions
Medical Image Analysis ( IF 10.9 ) Pub Date : 2021-10-20 , DOI: 10.1016/j.media.2021.102274
Abhijit Guha Roy 1 , Jie Ren 2 , Shekoofeh Azizi 1 , Aaron Loh 1 , Vivek Natarajan 1 , Basil Mustafa 2 , Nick Pawlowski 1 , Jan Freyberg 1 , Yuan Liu 1 , Zach Beaver 1 , Nam Vo 1 , Peggy Bui 1 , Samantha Winter 1 , Patricia MacWilliams 1 , Greg S Corrado 1 , Umesh Telang 1 , Yun Liu 1 , Taylan Cemgil 3 , Alan Karthikesalingam 1 , Balaji Lakshminarayanan 2 , Jim Winkens 1

Supervised deep learning models have proven to be highly effective in the classification of dermatological conditions. These models rely on the availability of abundant labeled training examples. However, in the real world, many dermatological conditions are individually too infrequent for per-condition classification with supervised learning. Although individually infrequent, these conditions may collectively be common and are therefore clinically significant in aggregate. To prevent models from generating erroneous outputs on such examples, there remains a considerable unmet need for deep learning systems that can better detect such infrequent conditions. These infrequent ‘outlier’ conditions are seen very rarely (or not at all) during training. In this paper, we frame this task as an out-of-distribution (OOD) detection problem. We set up a benchmark ensuring that outlier conditions are disjoint between the model training, validation, and test sets. Unlike traditional OOD detection benchmarks, where the task is to detect dataset distribution shift, we aim at the more challenging task of detecting subtle differences resulting from a different pathology or condition. We propose a novel hierarchical outlier detection (HOD) loss, which assigns multiple abstention classes corresponding to each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate that the proposed HOD-loss-based approach outperforms leading methods that leverage outlier data during training. Further, performance is significantly boosted by using recent representation learning methods (BiT, SimCLR, MICLe). In addition, we explore ensembling strategies for OOD detection and propose a diverse ensemble selection process for the best result.
We also perform a subgroup analysis over conditions of varying risk levels and different skin types to investigate how OOD performance changes over each subgroup and demonstrate the gains of our framework in comparison to the baseline. Furthermore, we go beyond traditional performance metrics and introduce a cost matrix for model trust analysis to approximate downstream clinical impact. We use this cost matrix to compare the proposed method against the baseline, thereby making a stronger case for its effectiveness in real-world scenarios.
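
The abstract describes the HOD loss only in words, so the following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes the classifier head has one logit per inlier class followed by one logit per training outlier (abstention) class, that the coarse inlier-vs-outlier probability is obtained by summing the fine-grained softmax probabilities of each group, and that the two cross-entropy terms are combined with a weight `lam`. The names `hod_loss`, `ood_score`, `num_inlier`, and the default `lam=0.1` are illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def hod_loss(logits, labels, num_inlier, lam=0.1, eps=1e-12):
    """Sketch of a hierarchical outlier detection loss (assumed form).

    logits: (batch, num_inlier + num_outlier) array; the first
            `num_inlier` columns are inlier classes, the rest are
            training outlier (abstention) classes.
    labels: (batch,) integer class indices over all columns.
    lam:    weight of the coarse term (assumed value, not from the source).
    """
    probs = softmax(logits)
    n = logits.shape[0]

    # Fine-grained term: cross-entropy over every individual class.
    fine = -np.log(probs[np.arange(n), labels] + eps)

    # Coarse term: binary cross-entropy on inlier vs. outlier, where the
    # outlier probability is the summed mass of all abstention classes.
    p_out = probs[:, num_inlier:].sum(axis=1)
    is_outlier = labels >= num_inlier
    p_coarse = np.where(is_outlier, p_out, 1.0 - p_out)
    coarse = -np.log(p_coarse + eps)

    return float((fine + lam * coarse).mean())


def ood_score(logits, num_inlier):
    """OOD score at inference: total probability on abstention classes."""
    return softmax(logits)[:, num_inlier:].sum(axis=1)
```

Under these assumptions, a test image would be flagged as a possible unseen condition when `ood_score` exceeds a threshold chosen on a validation set whose outlier conditions are disjoint from those used in training, matching the benchmark setup described in the abstract.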




Updated: 2021-10-30