Abstract
This study aimed to develop fracture prediction models using machine learning approaches and genomic data, and to identify the best modeling approach for fracture prediction. Genomic data from the Osteoporotic Fractures in Men (MrOS) cohort study (n = 5130) were analyzed. After comprehensive genotype imputation, a genetic risk score (GRS) was calculated for each participant from 1103 associated single nucleotide polymorphisms (SNPs). Data were normalized and split into a training set (80%) and a validation set (20%). Random forest, gradient boosting, neural network, and logistic regression were each used to develop prediction models for major osteoporotic fractures, with GRS, bone density, and other risk factors as predictors. In model training, the synthetic minority oversampling technique was used to account for the low fracture rate, and tenfold cross-validation was employed for hyperparameter optimization. In testing, the area under the curve (AUC) and accuracy were used to assess model performance, and the McNemar test was used to examine differences in accuracy between models. Gradient boosting had the best prediction performance, with an AUC of 0.71 and an accuracy of 0.88, and the GRS ranked as the 7th most important variable in the model. Random forest and neural network also performed significantly better than logistic regression. These results suggest that fracture prediction in older men can be improved by incorporating genetic profiling and by using the gradient boosting approach. They should not be extrapolated to women or young individuals.
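The accuracy comparison described above relies on the McNemar test, which looks only at the test-set cases where two models disagree. The following is a minimal sketch of that idea, not the authors' code. It assumes the exact (binomial) two-sided variant, which the abstract does not specify, and it uses a tiny made-up label set; the names `accuracy`, `mcnemar_exact`, `pred_gb`, and `pred_lr` are hypothetical.

```python
from math import comb


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts cases model A classified
    correctly and model B did not, and c counts the reverse.
    """
    rows = list(zip(y_true, pred_a, pred_b))
    b = sum(a == t and m != t for t, a, m in rows)
    c = sum(a != t and m == t for t, a, m in rows)
    n = b + c
    if n == 0:
        # The models never disagree, so there is no evidence of a difference.
        return b, c, 1.0
    # Under the null hypothesis, discordant pairs split 50/50,
    # so the smaller count follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Illustrative toy data only (assumption): 1 = fracture, 0 = no fracture.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
pred_gb = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]  # hypothetical gradient boosting output
pred_lr = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]  # hypothetical logistic regression output

b, c, p_value = mcnemar_exact(y_true, pred_gb, pred_lr)
print(accuracy(y_true, pred_gb), accuracy(y_true, pred_lr), b, c, p_value)
```

Because both models are scored on the same participants, a paired test of this kind is more appropriate than comparing two independent proportions. With a test set of realistic size, a chi-square approximation with continuity correction is a common alternative to the exact form.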
Abbreviations
- MrOS: Osteoporotic fractures in men study
- ML: Machine learning
- BMD: Bone mineral density
- FRAX: The fracture risk assessment tool
- GRS: Genetic risk score
- QUS: Quantitative ultrasound
- ROC: Receiver-operating curve
- AUC: Area under curve
- LR: Logistic regression
- RF: Random forest
- GB: Gradient boosting
- NN: Neural network
- MOF: Major osteoporotic fracture
- SNPs: Single nucleotide polymorphisms
- FNBMD: Femoral neck BMD
- TSBMD: Total spine BMD
- THBMD: Total hip BMD
- SOS: Speed of sound
- BUA: Broadband ultrasonic attenuation
- QUI: Quantitative ultrasonic index
Acknowledgements
The data and analyses presented in the current publication are based on study data downloaded from the dbGaP website, under phs000373.v1.p1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000373.v1.p1). The research and analysis described in the present study were supported by a COBRE grant from the National Institute of General Medical Sciences (GR08954), the Genome Acquisition to Analytics (GAA) Research Core of the Personalized Medicine Center of Biomedical Research Excellence at the Nevada Institute of Personalized Medicine, and the National Supercomputing Institute at the University of Nevada Las Vegas. The funding sponsors were not involved in the analysis design, genotype imputation, data analysis, interpretation of the analysis results, or the preparation, review, or approval of this manuscript.
Ethics declarations
Conflict of interest
Qing Wu, Fatma Nasoz, Jongyun Jung, Bibek Bhattarai and Mira V Han declare that they have no conflict of interest.
Human and Animal Rights and Informed Consent
This study analyzed de-identified, secondary data only, and was exempted by the Institutional Review Board at the University of Nevada, Las Vegas.
Cite this article
Wu, Q., Nasoz, F., Jung, J. et al. Machine Learning Approaches for Fracture Risk Assessment: A Comparative Analysis of Genomic and Phenotypic Data in 5130 Older Men. Calcif Tissue Int 107, 353–361 (2020). https://doi.org/10.1007/s00223-020-00734-y
DOI: https://doi.org/10.1007/s00223-020-00734-y