Machine learning methods are comparable to logistic regression techniques in predicting severe walking limitation following total knee arthroplasty

  • KNEE
Knee Surgery, Sports Traumatology, Arthroscopy

Abstract

Purpose

Machine learning methods are flexible prediction algorithms with potential advantages over conventional regression. This study aimed to use machine learning methods to predict walking limitation after total knee arthroplasty (TKA), and to compare their performance with that of logistic regression.

Methods

From the department’s clinical registry, a cohort of 4026 patients who underwent elective, primary TKA between July 2013 and July 2017 was identified. Candidate predictors included demographics and preoperative clinical, psychosocial, and outcome measures. The primary outcome was severe walking limitation at 6 months post-TKA, defined as a maximum walk time ≤ 15 min. Eight common regression (logistic, penalized logistic, and ordinal logistic with natural splines) and ensemble machine learning (random forest, extreme gradient boosting, and SuperLearner) methods were implemented to predict the probability of severe walking limitation. Models were compared on discrimination and calibration metrics.
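As an illustration of the outcome definition and the two kinds of metric named above, the following is a minimal sketch, not the authors' code (the reference list indicates the analysis was done in R). The walk times and predicted probabilities are hypothetical values invented for the example. The AUC measures discrimination, and the Brier score is an overall accuracy measure that reflects both calibration and discrimination.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true, p_pred):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen case receives a higher predicted risk than a randomly chosen
    non-case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical maximum walk times (minutes) at 6 months for six patients.
walk_minutes = [10, 30, 15, 45, 60, 20]

# Severe walking limitation as defined in the abstract: walk time <= 15 min.
y = [1 if t <= 15 else 0 for t in walk_minutes]

# Hypothetical predicted probabilities from any one of the models.
p = [0.60, 0.20, 0.30, 0.10, 0.05, 0.40]

print("Brier score:", brier_score(y, p))
print("AUC:", auc(y, p))
```

Lower Brier scores and higher AUCs indicate better performance. The pairwise AUC above is quadratic in sample size, which is fine for a sketch but would normally be replaced by a rank-based computation on a cohort of several thousand patients.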

Results

At 6 months post-TKA, 13% of patients had severe walking limitation. Machine learning and logistic regression models showed moderate discrimination [mean area under the ROC curve (AUC) 0.73–0.75]. Overall, the ordinal logistic regression model performed best, and the SuperLearner performed best among the machine learning methods, with negligible differences between them (Brier score difference < 0.001; 95% CI [−0.0025, 0.002]).
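The abstract does not state how the confidence interval for the Brier score difference was obtained. One common way to get such an interval is a paired percentile bootstrap, sketched below under that assumption. The function name, the outcome vector, and both sets of predicted probabilities are hypothetical.

```python
import random


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def bootstrap_brier_difference(y_true, p_a, p_b, n_boot=2000, seed=0):
    """Point estimate and 95% percentile-bootstrap interval for
    Brier(model A) - Brier(model B). Patients are resampled with
    replacement, and both models are scored on the same resample
    so that the comparison stays paired."""
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y_s = [y_true[i] for i in idx]
        a_s = [p_a[i] for i in idx]
        b_s = [p_b[i] for i in idx]
        diffs.append(brier_score(y_s, a_s) - brier_score(y_s, b_s))
    diffs.sort()
    lower = diffs[(25 * n_boot) // 1000]        # 2.5th percentile
    upper = diffs[(975 * n_boot) // 1000 - 1]   # 97.5th percentile
    point = brier_score(y_true, p_a) - brier_score(y_true, p_b)
    return point, lower, upper


# Hypothetical outcomes and predictions from two competing models.
y = [1, 0, 1, 0, 0, 0, 0, 1]
p_model_a = [0.60, 0.20, 0.30, 0.10, 0.05, 0.40, 0.15, 0.55]
p_model_b = [0.55, 0.25, 0.35, 0.10, 0.10, 0.35, 0.20, 0.50]

point, lower, upper = bootstrap_brier_difference(y, p_model_a, p_model_b)
print(f"Brier difference: {point:.4f} (95% CI {lower:.4f} to {upper:.4f})")
```

An interval that straddles zero, as reported in the abstract, is consistent with no meaningful difference in overall accuracy between the two models.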

Conclusions

When predicting post-TKA physical function, several machine learning methods did not outperform logistic regression, in particular ordinal logistic regression that does not assume linearity in its predictors.

Level of evidence

Prognostic level II

Acknowledgements

We thank Brandon Greenwell for his generous help with the vip R package and Michael W. Wade at Vanderbilt University Medical Center for his editorial work on this article. We acknowledge the support from Jennifer Liaw, the head of the Department of Physiotherapy, Singapore General Hospital. We thank William Yeo from the Orthopaedic Diagnostic Centre, Singapore General Hospital, for his assistance. Finally, we thank Ee-Lin Woon, Felicia Jie-Ting Seah, Nai-Hong Chan, and the therapy assistants (Penny Teh and Hamidah Binti Hanib) for their kind assistance.

Funding

No funding was provided for the completion of this study.

Author information

Corresponding author

Correspondence to Yong-Hao Pua.

Ethics declarations

Conflict of interest

The authors have no professional or financial affiliations that may be perceived to have biased the presentation. Each author certifies that he or she has no commercial associations that might pose a conflict of interest in connection with the submitted article.

Ethical approval

Ethical approval was provided by the SingHealth Centralized IRB (SingHealth CIRB 2014/2027, Singapore).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 84 kb)

Supplementary material 2 (DOCX 18 kb)

About this article

Cite this article

Pua, YH., Kang, H., Thumboo, J. et al. Machine learning methods are comparable to logistic regression techniques in predicting severe walking limitation following total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc 28, 3207–3216 (2020). https://doi.org/10.1007/s00167-019-05822-7

  • DOI: https://doi.org/10.1007/s00167-019-05822-7
