Analysis of Feature Extraction Methods for Prediction of 30-Day Hospital Readmissions.
Methods of Information in Medicine (IF 1.3), Pub Date: 2020-04-29, DOI: 10.1055/s-0040-1702159
Joel Sumner, Adel Alaeddini

Abstract

Objectives This article aims to determine whether, and by how much, feature extraction methods improve the performance of machine learning methods for predicting 30-day hospital readmissions.

Methods The study evaluates five feature extraction methods, namely principal component analysis (PCA), kernel principal component analysis (KPCA), isomap, Laplacian eigenmaps, and locality preserving projections (LPP), for improving the accuracy of nine machine learning methods in predicting 30-day hospital readmissions. The prediction methods considered are logistic regression, Cox regression, linear discriminant analysis, k-nearest neighbor (KNN), support vector machines (SVMs), bagged trees, boosted trees, random forest, and artificial neural networks. All models are developed in MATLAB and validated using the area under the curve (AUC) on two population-based data sets from partner hospitals.
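As an illustration of the validation metric only, the following is a minimal sketch of how an area under the curve can be computed from predicted risk scores. It assumes the curve is the ROC curve and uses the standard rank-sum (Mann-Whitney U) identity; the function name `roc_auc`, the 0/1 label coding, and the toy data are hypothetical and are not taken from the study, which was implemented in MATLAB.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (assumed coding: 1 = readmitted within 30 days)
    scores: sequence of predicted risk scores (higher = more likely readmitted)
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Walk through groups of tied scores; each group shares its average rank.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # U statistic of the positive class, normalised by the number of
    # positive/negative pairs, equals the ROC AUC.
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data, invented for illustration.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a comparison like the one described, such a score would be computed once for a model trained on the raw features and once for the same model trained on the extracted features, and the two values compared.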

Results Laplacian eigenmaps and isomap feature extraction provide the largest improvement in readmission predictive accuracy for KNN, SVM, bagged trees, boosted trees, and linear discriminant analysis. Artificial neural networks, random forest, Cox regression, and logistic regression show improvement on only one of the two data sets. PCA and LPP provide the best computational efficiency, followed by KPCA, Laplacian eigenmaps, and isomap.

Conclusion Feature extraction methods can improve the performance of machine learning methods for predicting readmissions. However, the improvement depends on the choice of prediction method, the choice of feature extraction method, and the complexity of the data set features.




Updated: 2020-04-29