Development of an AI system for accurately diagnose hepatocellular carcinoma from computed tomography imaging data
British Journal of Cancer (IF 8.8), Pub Date: 2021-08-07, DOI: 10.1038/s41416-021-01511-w
Meiyun Wang 1 , Fangfang Fu 1 , Bingjie Zheng 2 , Yan Bai 1 , Qingxia Wu 1 , Jianqiang Wu 3 , Lin Sun 4 , Qiuyu Liu 5 , Mingge Liu 6 , Yichen Yang 7 , Hongru Shen 7 , Dalu Kong 8 , Xiaoyue Ma 9 , Peiting You 10 , Xiangchun Li 7 , Fei Tian 8

Background and aims

Computed tomography (CT) is frequently used to detect hepatocellular carcinoma (HCC) in routine clinical practice. This study aimed to develop a deep-learning AI system that improves the diagnostic accuracy of HCC by analysing liver CT imaging data.

Methods

We developed a deep-learning AI system by training on CT images from 7512 patients at Henan Provincial People's Hospital. Its performance was validated on one internal test set (Henan Provincial People's Hospital, n = 385) and one external test set (Henan Provincial Cancer Hospital, n = 556). The area under the receiver-operating characteristic curve (AUROC) was used as the primary classification metric. Accuracy, sensitivity, specificity, precision, negative predictive value and the F1 metric were used to measure the performance of the AI system and of radiologists.
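The metrics named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `binary_metrics` and `auroc`, the 0/1 label encoding (1 = HCC) and the toy data are all assumptions made for illustration. AUROC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, which is one standard way to define it.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float   # recall on the positive (HCC) class
    specificity: float
    precision: float     # positive predictive value
    npv: float           # negative predictive value
    f1: float


def _ratio(num: float, den: float) -> float:
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Threshold-dependent metrics from 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def auroc(y_true, scores) -> float:
    """AUROC as P(score of a random positive > score of a random negative).

    Ties count as half a win. This is O(n_pos * n_neg), which is fine for
    a sketch but slow for large test sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not taken from the study.
    labels = [1, 1, 1, 0, 0, 0]
    risk = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
    preds = [1 if s >= 0.5 else 0 for s in risk]  # 0.5 cut-off is an assumption
    print(binary_metrics(labels, preds))
    print(auroc(labels, risk))
```

Note that AUROC is computed from the continuous risk scores, whereas the other six metrics depend on the cut-off used to turn scores into predictions; the abstract does not state which cut-off was used.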

Results

The AI system achieved high performance in identifying HCC patients, with an AUROC of 0.887 (95% CI 0.855–0.919) on the internal test set and 0.883 (95% CI 0.855–0.911) on the external test set. For the internal test set, accuracy was 81.0% (76.8–84.8%), sensitivity was 78.4% (72.4–83.7%), specificity was 84.4% (78.0–89.6%) and F1 (the harmonic mean of precision and recall) was 0.824. For the external test set, accuracy was 81.3% (77.8–84.5%), sensitivity was 89.4% (85.0–92.8%), specificity was 74.0% (68.5–78.9%) and F1 was 0.819. Compared with radiologists, the AI system achieved comparable accuracy and F1 on the internal test set (0.853 vs. 0.818, P = 0.107; 0.863 vs. 0.824, P = 0.082) and on the external test set (0.805 vs. 0.793, P = 0.663; 0.810 vs. 0.814, P = 0.866). The HCC risk scores predicted by the AI system were higher in HCC patients with multiple tumours and a high fibrosis stage than in those with a solitary tumour and a low fibrosis stage (tumour number: 0.197 vs. 0.138, P = 0.006; fibrosis stage: 0.183 vs. 0.127, P < 0.001). Radiologists' review showed that the accuracy of the saliency heatmaps produced by the algorithm was 92.1% (95% CI: 89.2–95.0%).
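Because F1 is the harmonic mean of precision and recall (sensitivity), any one of the three can be recovered from the other two. The sketch below shows that relationship; the function names are hypothetical, and any precision back-calculated this way from the rounded figures above is only an approximation, not a value reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_from_f1(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision: F1 must be smaller than 2 * recall")
    return f1 * recall / denom


if __name__ == "__main__":
    # Inputs are the rounded sensitivity and F1 quoted for the two test sets.
    for name, recall, f1 in [("internal", 0.784, 0.824),
                             ("external", 0.894, 0.819)]:
        implied = precision_from_f1(f1, recall)
        print(f"{name}: implied precision is roughly {implied:.3f}")
```

The same identity explains how the external test set can pair a higher sensitivity with a similar F1: a lower specificity usually means more false positives, which pulls precision down.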

Conclusions

The AI system achieved high performance in detecting HCC, comparable to that of a group of specialised radiologists. Prospective clinical trials are needed to verify this model.




Updated: 2021-08-09