An efficient machine learning-based approach for screening individuals at risk of hereditary haemochromatosis
Scientific Reports (IF 4.6), Pub Date: 2020-11-26, DOI: 10.1038/s41598-020-77367-6
Patricia Martins Conde, Thomas Sauter, Thanh-Phuong Nguyen

Hereditary haemochromatosis (HH) is an autosomal recessive disease in which HFE C282Y homozygosity accounts for 80–85% of clinical cases among the Caucasian population. HH is characterised by the accumulation of iron, which, if untreated, can lead to liver cirrhosis and liver cancer. Since iron overload is preventable and treatable if diagnosed early, high-risk individuals can be identified through effective screening that employs artificial intelligence-based approaches. However, such tools raise new challenges in handling and integrating large heterogeneous datasets. We have developed an efficient computational model to screen individuals for HH using the family study data of the Hemochromatosis and Iron Overload Screening (HEIRS) cohort. This dataset, consisting of 254 cases and 701 controls, contains variables extracted from questionnaires and laboratory blood tests. The final model, an extreme gradient boosting classifier, was trained on the most relevant risk factors: HFE C282Y homozygosity, age, mean corpuscular volume, iron level, serum ferritin level, transferrin saturation, and unsaturated iron-binding capacity. Hyperparameter optimisation was carried out over multiple runs, resulting in an area under the receiver operating characteristic curve (AUCROC) of 0.94 ± 0.02 for tenfold stratified cross-validation, outperforming the iron overload screening (IRON) tool.
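The evaluation protocol named in the abstract (tenfold stratified cross-validation, reported as a mean AUCROC with its spread) can be illustrated with a minimal sketch. This is not the authors' pipeline. The data are synthetic and only reuse the cohort's class sizes, the "model" is a placeholder rather than an extreme gradient boosting classifier, and the names `stratified_folds`, `auc_roc`, `cross_validated_auc`, and `fit` are hypothetical.

```python
import random
from statistics import mean, stdev


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that each keep the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def auc_roc(y_true, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Return one AUC per held-out fold.

    `fit(X_train, y_train)` is an assumed interface: it must return a
    function that maps a single feature row to a risk score.
    """
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        aucs.append(auc_roc([y[i] for i in test_idx],
                            [score(X[i]) for i in test_idx]))
    return aucs


# Synthetic stand-in data with the cohort's class sizes (254 cases, 701 controls).
rng = random.Random(42)
y = [1] * 254 + [0] * 701
X = [[rng.gauss(1.0 if label else 0.0, 1.0)] for label in y]


def fit(X_train, y_train):
    # Placeholder "model": ignores the training data and uses the single
    # synthetic feature directly as the risk score.
    return lambda row: row[0]


aucs = cross_validated_auc(X, y, fit)
print(f"AUC: {mean(aucs):.2f} ± {stdev(aucs):.2f}")
```

Stratification matters here because the classes are imbalanced: with 254 cases and 701 controls, every fold receives 25 or 26 cases, so each held-out AUC is computed on a comparable mix. A real classifier would replace `fit`, and the printed figure reflects only the synthetic data.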




Updated: 2020-11-27