Prediction of mortality in Intensive Care Units: a multivariate feature selection.
Journal of Biomedical Informatics (IF 4.5), Pub Date: 2020-05-23, DOI: 10.1016/j.jbi.2020.103456
Flávio Monteiro 1 , Fernando Meloni 2 , José Augusto Baranauskas 1 , Alessandra Alaniz Macedo 1

Context

The critical condition of patients in Intensive Care Units (ICUs) demands intensive monitoring of their vital signs as well as highly qualified professional assistance. Together, these needs make ICUs very expensive, so investment must be prioritized. Administrative issues emerge, and health institutions face questions such as “How many beds should an ICU provide to serve the population at the lowest cost?” and “Which body information is most critical to monitor in an ICU?”. Because of their financial and ethical implications, these judgments require precise technical knowledge. Decisions have usually relied on clinical scores, such as APACHE (Acute Physiology And Chronic Health Evaluation) and SOFA (Sequential Organ Failure Assessment), which are imprecise and outdated. The popularization of machine learning techniques has shed some light on the topic as a way to renew the purpose of such scores. In 2012, PhysioNet/Computing in Cardiology launched a Challenge on ICU patients, which aimed to stimulate the development of techniques to predict mortality in ICUs. Based on biometric and physiological features collected from patients, the participants used their classifiers to predict each patient’s death risk. Several participants achieved better results than the SOFA and APACHE scores; even so, the prediction levels were 54%, which is weak.

Objectives

Here, we investigate the reasons that led to these results in order to ground our solution. We then propose alternative practices in an attempt to improve the results. Our main goal is to improve the prediction of mortality in ICUs by using the same data employed during the 2012 PhysioNet Challenge. Our specific objectives are (i) to simplify the problem by reducing its dimensionality; (ii) to reduce the uncontrolled variance; and (iii) to make classifiers less dependent on the training set.

Methods

Accordingly, we propose a methodology based on extensive steps, including sample filtering and data normalization. To select features and to reduce the intra-group variance, we recursively apply multivariate data analysis: Principal Component Analysis, Factor Analysis, Spectral Clustering, and Tukey’s HSD test. After that, we use machine learning techniques to build classifiers with different methods. We evaluate our results with the same metrics proposed by the 2012 PhysioNet Challenge.
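
As a rough illustration of two of the steps named above, the following sketch normalizes each feature to z-scores and then ranks features by the magnitude of their loading on the first principal component. It is a simplification and not the authors' procedure: Factor Analysis, Spectral Clustering, Tukey's HSD test, and the recursive application are all left out, and ranking by first-component loading is only one crude heuristic for feature selection. The feature names and values are invented toy data, not taken from the Challenge dataset.

```python
import math

def zscore_columns(rows):
    """Rescale every column to zero mean and unit (population) variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

# Hypothetical toy data: two strongly related vital signs and one unrelated column.
names = ["temperature", "heart_rate", "noise"]
rows = [
    [36.5, 70.0, 1.0],
    [37.0, 80.0, 0.2],
    [38.0, 95.0, 0.9],
    [39.0, 110.0, 0.1],
    [36.8, 75.0, 0.5],
    [38.5, 100.0, 0.6],
]

normalized = zscore_columns(rows)
loadings = first_component(covariance(normalized))

# Features with the largest absolute loading contribute most to the main axis of variation.
ranked = sorted(zip(names, loadings), key=lambda pair: abs(pair[1]), reverse=True)
print(ranked)
```

On this toy data the two correlated columns dominate the first component and the unrelated column ranks last. With real ICU records, a library implementation of PCA would normally be preferable to hand-written power iteration.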

Results

For classifiers built and tested on independent datasets, our best classifier was a linear SVM, which achieved a score of 0.73. This result was significantly better than the 0.54 achieved in previous work, at a confidence level above 99%. Furthermore, our approach required only twelve features, consistently fewer than the number required by previous approaches.
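
The abstract does not name the metric behind the 0.73 and 0.54 figures. The 2012 Challenge's Event 1 score is commonly described as the minimum of sensitivity and positive predictive value, and the sketch below assumes that definition; the function name and the labels are hypothetical toy values, not results from the paper.

```python
def event1_score(y_true, y_pred):
    """Assumed Event 1 score: min(sensitivity, positive predictive value).

    Labels are 1 for in-hospital death and 0 for survival.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths correctly predicted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors flagged as deaths
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return min(sensitivity, ppv)

# Toy example: 4 deaths among 10 patients; 3 are caught, with 1 false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score = event1_score(y_true, y_pred)
print(score)  # 0.75
```

Taking the minimum penalizes a classifier that trades one quantity for the other, for example one that flags nearly every patient to raise sensitivity.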

Conclusion

Our results indicated that our approach offered: (a) higher performance in predicting death risk (+20%); (b) smaller dependence on the training set; and (c) lower costs for ICU monitoring (fewer features). Besides its better predictive power, our approach also demanded lower implementation costs and could therefore suit a wider range of ICUs. Future studies should employ our proposal to investigate the possibility of including some physiological features that were not available for the 2012 PhysioNet Challenge.



Updated: 2020-05-23