Machine Learning Approaches for Fracture Risk Assessment: A Comparative Analysis of Genomic and Phenotypic Data in 5130 Older Men

  • Original Research
Calcified Tissue International

Abstract

This study aimed to develop fracture prediction models using machine learning approaches and genomic data, and to identify the best modeling approach for fracture prediction. Genomic data from the Osteoporotic Fractures in Men (MrOS) cohort study (n = 5130) were analyzed. After comprehensive genotype imputation, a genetic risk score (GRS) was calculated for each participant from 1103 associated single nucleotide polymorphisms (SNPs). Data were normalized and split into a training set (80%) and a validation set (20%). Random forest, gradient boosting, neural network, and logistic regression models were each developed to predict major osteoporotic fractures, with GRS, bone density, and other risk factors as predictors. During model training, the synthetic minority oversampling technique was used to account for the low fracture rate, and tenfold cross-validation was used for hyperparameter optimization. In testing, the area under the curve (AUC) and accuracy were used to assess model performance, and the McNemar test was used to examine differences in accuracy between models. Gradient boosting performed best, with an AUC of 0.71 and an accuracy of 0.88, and the GRS ranked as the 7th most important variable in that model. Random forest and neural network also performed significantly better than logistic regression. These results suggest that fracture prediction in older men can be improved by incorporating genetic profiling and by using the gradient boosting approach. The results should not be extrapolated to women or young individuals.
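Three of the quantities named in the abstract (the GRS, the AUC, and the McNemar test) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code. The abstract does not state how the GRS was weighted or which form of the McNemar test was used, so the weighted sum of risk-allele dosages and the exact binomial form below are assumptions, and all function names are hypothetical.

```python
from math import comb


def genetic_risk_score(dosages, weights):
    """Assumed weighted GRS: sum over SNPs of risk-allele dosage (0 to 2)
    multiplied by a per-SNP effect size. The weighting scheme is an
    assumption; the abstract does not specify it."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen case receives a higher
    score than a randomly chosen non-case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test comparing two classifiers on the same
    test set. Only discordant pairs (one model right, the other wrong)
    carry information about the accuracy difference."""
    a_only = sum(1 for y, pa, pb in zip(y_true, pred_a, pred_b)
                 if pa == y and pb != y)
    b_only = sum(1 for y, pa, pb in zip(y_true, pred_a, pred_b)
                 if pa != y and pb == y)
    n = a_only + b_only
    if n == 0:
        return a_only, b_only, 1.0
    k = min(a_only, b_only)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    p_value = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return a_only, b_only, min(1.0, p_value)
```

In a comparison like the one described, `mcnemar_exact` would be called with the test-set labels and the class predictions of two models (for example gradient boosting and logistic regression), while `auc` would take each model's predicted fracture probabilities.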

Abbreviations

MrOS: Osteoporotic fractures in men study
ML: Machine learning
BMD: Bone mineral density
FRAX: The fracture risk assessment tool
GRS: Genetic risk score
QUS: Quantitative ultrasound
ROC: Receiver-operating curve
AUC: Area under curve
LR: Logistic regression
RF: Random forest
GB: Gradient boosting
NN: Neural network
MOF: Major osteoporotic fracture
SNPs: Single nucleotide polymorphisms
FNBMD: Femoral neck BMD
TSBMD: Total spine BMD
THBMD: Total hip BMD
SOS: Speed of sound
BUA: Broadband ultrasonic attenuation
QUI: Quantitative ultrasonic index

Acknowledgements

The data and analyses presented in this publication are based on study data downloaded from the dbGaP website under accession phs000373.v1.p1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000373.v1.p1). The research and analysis described in this study were supported by a COBRE grant from the National Institute of General Medical Sciences (GR08954), the Genome Acquisition to Analytics (GAA) Research Core of the Personalized Medicine Center of Biomedical Research Excellence at the Nevada Institute of Personalized Medicine, and the National Supercomputing Institute at the University of Nevada Las Vegas. The funding sponsors were not involved in the analysis design, genotype imputation, data analysis, or interpretation of the results, nor in the preparation, review, or approval of this manuscript.

Author information

Corresponding author

Correspondence to Qing Wu.

Ethics declarations

Conflict of interest

Qing Wu, Fatma Nasoz, Jongyun Jung, Bibek Bhattarai and Mira V Han declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

This study analyzed de-identified, secondary data only, and was exempted by the Institutional Review Board at the University of Nevada, Las Vegas.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, Q., Nasoz, F., Jung, J. et al. Machine Learning Approaches for Fracture Risk Assessment: A Comparative Analysis of Genomic and Phenotypic Data in 5130 Older Men. Calcif Tissue Int 107, 353–361 (2020). https://doi.org/10.1007/s00223-020-00734-y

  • DOI: https://doi.org/10.1007/s00223-020-00734-y
