A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers
Medical Image Analysis (IF 10.9), Pub Date: 2021-11-13, DOI: 10.1016/j.media.2021.102300
György Kovács, Attila Fazekas

In the last 15 years, the segmentation of vessels in retinal images has become an intensively researched problem in medical imaging, with hundreds of algorithms published. One of the de facto benchmarking data sets of vessel segmentation techniques is the DRIVE data set. Since DRIVE contains a predefined split of training and test images, the published performance results of the various segmentation techniques should provide a reliable ranking of the algorithms. Including more than 100 papers in the study, we performed a detailed numerical analysis of the coherence of the published performance scores. We found inconsistencies in the reported scores related to the use of the field of view (FoV), which has a significant impact on the performance scores. We attempted to eliminate the biases using numerical techniques to provide a more realistic picture of the state of the art. Based on the results, we have formulated several findings, most notably: despite the well-defined test set of DRIVE, most rankings in published papers are based on non-comparable figures; in contrast to the near-perfect accuracy scores reported in the literature, the highest accuracy score achieved to date is 0.9582 in the FoV region, which is 1% higher than that of human annotators. The methods we have developed for identifying and eliminating the evaluation biases can be easily applied to other domains where similar problems may arise.
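The effect of the FoV on accuracy can be illustrated with a small sketch. It is not the authors' correction method. It only shows, under a stated assumption, why a score computed over the whole image is not comparable with one computed inside the FoV. The assumption is that every pixel outside the FoV is background and is predicted as background, so it counts as an extra true negative. The function names and the pixel counts are hypothetical and are not taken from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Pixel-wise accuracy from confusion-matrix counts."""
    return (tp + tn) / (tp + tn + fp + fn)


def accuracy_with_and_without_fov(tp, tn, fp, fn, n_outside):
    """Return (accuracy inside the FoV, accuracy over the whole image).

    Assumption: every pixel outside the FoV is background and is predicted
    as background, so it is added to the true negatives.
    """
    acc_fov = accuracy(tp, tn, fp, fn)
    acc_all = accuracy(tp, tn + n_outside, fp, fn)
    return acc_fov, acc_all


# Hypothetical counts for one image (illustrative only, not from the paper).
tp, tn, fp, fn = 80, 860, 20, 40   # 1000 pixels inside the FoV
n_outside = 500                    # pixels outside the FoV

acc_fov, acc_all = accuracy_with_and_without_fov(tp, tn, fp, fn, n_outside)
print(f"accuracy inside FoV:  {acc_fov:.4f}")   # 0.9400
print(f"accuracy whole image: {acc_all:.4f}")   # 0.9600
```

With these made-up counts the same segmentation scores 0.94 inside the FoV and 0.96 over the whole image, so ranking one paper's FoV score against another's whole-image score would be misleading.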




Updated: 2021-11-22