Review article
Comparison of machine learning and logistic regression models in predicting acute kidney injury: A systematic review and meta-analysis

https://doi.org/10.1016/j.ijmedinf.2021.104484
Under a Creative Commons license
open access

Highlights

  • Logistic regression has been used to predict acute kidney injury in critical care settings.

  • Interest in machine learning algorithms to predict acute kidney injury has grown; however, optimal algorithms and predictor variables are poorly understood.

  • The performance of machine learning in predicting acute kidney injury is variable and depends on the predictor variables included in the model as well as the type of algorithm deployed.

  • While common biomarkers for acute kidney injury (creatinine, BUN) were important for model performance, a large number of other predictor variables have been identified and are currently used in machine learning models.

  • Machine learning performance is comparable to that of conventional logistic regression models in predicting acute kidney injury.

Abstract

Introduction

We aimed to assess whether machine learning (ML) models outperform logistic regression (LR), a conventional prediction model, in predicting acute kidney injury (AKI).

Methods

Eligible studies were identified by searching PubMed and Embase. A total of 24 studies comprising 84 prediction models met the inclusion criteria. An independent-samples t-test was performed to detect mean differences in area under the curve (AUC) between ML and LR models. One-way ANOVA and post-hoc t-tests were performed to assess mean differences in AUC between ML methods.
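The two comparisons described above can be illustrated with a minimal sketch. It assumes a pooled-variance (equal-variance) t-test, which the abstract does not specify, and it uses AUC values invented purely for illustration; none of the numbers, variable names, or method groupings below come from the review.

```python
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Independent-samples t statistic, assuming equal variances (an assumption)."""
    na, nb = len(a), len(b)
    # Pooled sample variance across the two groups.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2  # statistic and degrees of freedom


def one_way_anova_f(groups):
    """One-way ANOVA F statistic with between- and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k


# Hypothetical AUC values, invented for illustration only (not from the review).
ml_aucs = [0.62, 0.71, 0.84, 0.90, 0.68, 0.77]
lr_aucs = [0.70, 0.75, 0.78, 0.73]
t_stat, t_df = pooled_t_statistic(ml_aucs, lr_aucs)

# Hypothetical grouping of ML models by algorithm type.
auc_by_method = {
    "gradient boosting": [0.80, 0.85, 0.88],
    "random forest": [0.72, 0.76, 0.79],
    "neural network": [0.66, 0.70, 0.74],
}
f_stat, df_between, df_within = one_way_anova_f(list(auc_by_method.values()))

print(f"t = {t_stat:.3f} (df = {t_df})")
print(f"F = {f_stat:.3f} (df = {df_between}, {df_within})")
```

Converting these statistics to p-values requires the t and F distributions, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.f_oneway`) would be one option.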

Results

AUC values were similar between ML (0.736 ± 0.116) and LR (0.748 ± 0.057) models (p = 0.538). However, specific ML models, such as gradient boosting (0.838 ± 0.077), exhibited superior performance in predicting AKI compared with other ML models in the literature (p < 0.05). Creatinine and urine output, standard variables assessed for AKI staging, were classified as significant predictors across multiple ML models, although the majority of significant predictors were unique and study-specific.

Conclusions

These data suggest that ML models perform comparably to LR overall; however, ML performance is variable, with some ML models displaying exceptional performance. The variability in ML prediction of AKI can be attributed, in part, to the specific ML model utilized, variable selection and processing, study and subject characteristics, and the steps associated with model training, validation, testing, and calibration.

Abbreviations

AKI
acute kidney injury
GFR
glomerular filtration rate
RIFLE
risk, injury, failure, loss of kidney function, and end-stage kidney disease
AKIN
acute kidney injury network
KDIGO
kidney disease: improving global outcomes
CKD
chronic kidney disease
ML
machine learning
LR
logistic regression
EHR
electronic health records
AUC
area under the curve
SD
standard deviation
BUN
blood urea nitrogen

Keywords

Acute kidney injury
Machine learning
Artificial intelligence
Logistic regression
