Validation and Clinical Applicability of Whole-Volume Automated Segmentation of Optical Coherence Tomography in Retinal Disease Using Deep Learning, JAMA Ophthalmology


JAMA Ophthalmology (IF 7.8), Pub Date: 2021-09-01, DOI: 10.1001/jamaophthalmol.2021.2273
Marc Wilson¹, Reena Chopra¹,²,³, Megan Z Wilson¹, Charlotte Cooper¹, Patricia MacWilliams¹, Yun Liu⁴, Ellery Wulczyn⁴, Daniela Florea²,³, Cían O Hughes¹, Alan Karthikesalingam¹, Hagar Khalid²,³, Sandra Vermeirsch²,³, Luke Nicholson²,³, Pearse A Keane²,³, Konstantinos Balaskas²,³, Christopher J Kelly¹


Importance: Quantitative volumetric measures of retinal disease in optical coherence tomography (OCT) scans are infeasible to perform owing to the time required for manual grading. Expert-level deep learning systems for automatic OCT segmentation have recently been developed. However, the potential clinical applicability of these systems is largely unknown.

Objective: To evaluate a deep learning model for whole-volume segmentation of 4 clinically important pathological features and assess clinical applicability.

Design, Setting, and Participants: This diagnostic study used OCT data from 173 patients with a total of 15 558 B-scans, treated at Moorfields Eye Hospital. The data set included 2 common OCT devices and 2 macular conditions: wet age-related macular degeneration (107 scans) and diabetic macular edema (66 scans), covering the full range of severity, and from 3 points during treatment. Two expert graders performed pixel-level segmentations of intraretinal fluid, subretinal fluid, subretinal hyperreflective material, and pigment epithelial detachment, including all B-scans in each OCT volume, taking as long as 50 hours per scan. Quantitative evaluation of whole-volume model segmentations was performed. Qualitative evaluation of clinical applicability by 3 retinal experts was also conducted. Data were collected from June 1, 2012, to January 31, 2017, for set 1 and from January 1 to December 31, 2017, for set 2; graded between November 2018 and January 2020; and analyzed from February 2020 to November 2020.

Main Outcomes and Measures: Rating and stack ranking for clinical applicability by retinal specialists, model-grader agreement for voxelwise segmentations, and total volume evaluated using Dice similarity coefficients, Bland-Altman plots, and intraclass correlation coefficients.
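As an illustration of two of the agreement measures named above, the following is a minimal sketch, not the study's own code. It assumes each segmentation is represented as a set of (slice, row, column) voxel coordinates labelled with a given feature, and that total volumes are already available as paired numbers per scan. The function names `dice_coefficient` and `bland_altman` and the toy data are hypothetical.

```python
from statistics import mean, stdev


def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity between two collections of labelled voxel coordinates.

    Dice = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap) to 1.
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def bland_altman(model_volumes, grader_volumes):
    """Return (bias, lower limit, upper limit) for paired volume measurements.

    Bias is the mean model-minus-grader difference; the limits of agreement
    are bias +/- 1.96 sample standard deviations of those differences.
    """
    diffs = [m - g for m, g in zip(model_volumes, grader_volumes)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Toy data only: two small voxel sets and three paired volumes.
    model_mask = {(0, 0, 0), (0, 0, 1)}
    grader_mask = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
    print("Dice:", dice_coefficient(model_mask, grader_mask))
    print("Bland-Altman:", bland_altman([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
```

Dice measures voxelwise overlap, whereas the Bland-Altman statistics compare only total volumes, so a model can agree well on volume while overlapping poorly in location.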

Results: Among the 173 patients included in the analysis (92 [53%] women), qualitative assessment found that automated whole-volume segmentation ranked better than or comparable to at least 1 expert grader in 127 scans (73%; 95% CI, 66%-79%). A neutral or positive rating was given to 135 model segmentations (78%; 95% CI, 71%-84%) and 309 expert gradings (2 per scan) (89%; 95% CI, 86%-92%). The model was rated neutrally or positively in 86% to 92% of diabetic macular edema scans and 53% to 87% of age-related macular degeneration scans. Intraclass correlations ranged from 0.33 (95% CI, 0.08-0.96) to 0.96 (95% CI, 0.90-0.99). Dice similarity coefficients ranged from 0.43 (95% CI, 0.29-0.66) to 0.78 (95% CI, 0.57-0.85).
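The abstract does not say how the confidence intervals for the rating proportions were computed. As a hedged sketch, assuming a Wilson score interval with z = 1.96 (an assumption, not a method confirmed by the source), the reported proportions can be checked as follows. The function name `wilson_interval` is hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    z = 1.96 corresponds to an approximate 95% interval.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Counts taken from the Results paragraph: scans (or gradings) out of total.
    for label, k, n in [
        ("model ranked >= 1 grader", 127, 173),
        ("expert gradings neutral/positive", 309, 346),
    ]:
        lo, hi = wilson_interval(k, n)
        print(f"{label}: {k / n:.0%} ({lo:.0%} to {hi:.0%})")
```

Under this assumption, 127 of 173 gives an interval that rounds to 66% to 79%, and 309 of 346 gives one that rounds to 86% to 92%, consistent with the figures reported above. Other interval methods may give slightly different bounds.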

Conclusions and Relevance: This deep learning–based segmentation tool provided clinically useful measures of retinal disease that would otherwise be infeasible to obtain. Qualitative evaluation was additionally important to reveal clinical applicability for both care management and research.


Updated: 2021-09-15
