A practical model for the identification of congenital cataracts using machine learning.
EBioMedicine ( IF 9.7 ) Pub Date : 2020-01-02 , DOI: 10.1016/j.ebiom.2019.102621
Duoru Lin 1 , Jingjing Chen 1 , Zhuoling Lin 1 , Xiaoyan Li 1 , Kai Zhang 2 , Xiaohang Wu 1 , Zhenzhen Liu 1 , Jialing Huang 3 , Jing Li 1 , Yi Zhu 4 , Chuan Chen 4 , Lanqin Zhao 1 , Yifan Xiang 1 , Chong Guo 1 , Liming Wang 5 , Yizhi Liu 1 , Weirong Chen 1 , Haotian Lin 1

BACKGROUND: Approximately 1 in 33 newborns worldwide is affected by congenital anomalies. We aimed to develop a practical model for identifying infants at high risk of congenital cataracts (CCs), the leading cause of avoidable childhood blindness.

METHODS: This case-control study was performed at the Zhongshan Ophthalmic Center and involved 2005 subjects: 1274 children with CCs and 731 healthy controls. The CC identification models were built from birth conditions, family medical history, and family environmental factors using the random forest (RF) and adaptive boosting methods. They were trained on 1129 CC cases and 609 healthy controls, then tested by internal 4-fold cross-validation and by external validation (145 CC cases and 122 healthy controls). The models were also tested on 4 datasets with gradually reduced proportions of CC patients (bilateral cases) to approximate a clinical environment with relatively low disease prevalence.

FINDINGS: The CC identification models showed high discrimination in both the 4-fold cross-validation (area under the curve (AUC) = 0.91 [95% confidence interval: 0.88-0.94] in bilateral cases; 0.82 [0.77-0.89] in unilateral cases) and the external validation (AUC = 0.93±0.05 in bilateral cases; 0.86±0.01 in unilateral cases), and achieved stable performance in the clinical tests (AUC = 0.94-0.96 in the four subgroups by RF). Furthermore, family history of CC, low parental education level, and comorbidity were identified as the three factors most relevant to both bilateral and unilateral CC diagnosis.

INTERPRETATION: Our CC identification models can accurately discriminate CC patients from healthy children and have the potential to serve as a complementary screening procedure, especially in underdeveloped and remote areas.
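The reduced-prevalence test described in the methods can be illustrated with a minimal sketch. This is not the authors' code and does not use the study data. It assumes synthetic Gaussian risk scores, hypothetical helper names (`auc`, `subsample_cases`), and arbitrary case-retention fractions, since the abstract does not state the actual proportions. It computes the AUC by pairwise ranking and re-evaluates it as cases are progressively removed while all controls are kept.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count case/control pairs where the case scores higher; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def subsample_cases(scores, labels, keep_fraction, rng):
    """Keep every control and a random fraction of cases to lower prevalence."""
    kept = [(s, y) for s, y in zip(scores, labels)
            if y == 0 or rng.random() < keep_fraction]
    return [s for s, _ in kept], [y for _, y in kept]

rng = random.Random(0)
# Synthetic scores (an assumption, NOT study data): cases tend to score higher.
# Group sizes mirror the external validation set only for illustration.
labels = [1] * 145 + [0] * 122
scores = [rng.gauss(1.5, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

# Hypothetical retention fractions; the abstract does not give the real ones.
for frac in (1.0, 0.5, 0.25, 0.1):
    s, y = subsample_cases(scores, labels, frac, rng)
    print(f"prevalence={sum(y) / len(y):.2f}  AUC={auc(s, y):.3f}")
```

Because the AUC depends only on how cases rank relative to controls, it should stay roughly stable as prevalence falls, which is consistent with the stable performance the abstract reports across the four subgroups.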

Updated: 2020-01-04