Identification of significant risks in pediatric acute lymphoblastic leukemia (ALL) through machine learning (ML) approach

  • Original Article
Medical & Biological Engineering & Computing

Abstract

Pediatric acute lymphoblastic leukemia (ALL) was analyzed with machine learning (ML) techniques to determine the significance of clinical and phenotypic variables, as well as environmental conditions, that may identify the underlying causes of childhood ALL. Fifty pediatric patients (n = 50) diagnosed with ALL were included according to the inclusion and exclusion criteria. Clinical variables comprised the blood biochemistry results (CBC, LFTs, RFTs) and the type of ALL, i.e., T-ALL or B-ALL. Phenotypic data included the age and sex of the child and consanguinity, while environmental factors included habitat, socioeconomic status, and access to filtered drinking water. Fifteen different features/attributes were collected for each case individually. To retrieve the most useful discriminating attributes, four supervised ML algorithms were used: classification and regression trees (CART), random forest (RM), gradient boosted machine (GM), and the C5.0 decision tree algorithm. To determine the accuracy of the derived CART algorithm on future data, a ten-fold cross-validation was performed on the present data set. ALL was most common in male children below 5 years of age who belonged to middle-class families in rural areas. B-ALL was more frequent than T-ALL. Consanguinity was present in 54% of cases. Low levels of platelets and hemoglobin and high levels of white blood cells were reported in childhood ALL patients. CART provided the best and most complete fit for the entire data set, yielding a 99.83% model fit accuracy and a misclassification rate of 0.17% on the entire sample space, while C5.0 reported 98.6%, random forest 94.44%, and gradient boosted machine 95.61% fitting. The variable importance of the primary discriminating attributes was platelets 43%, hemoglobin 24%, white blood cells 4%, and sex of the child 4%. An overall accuracy of 87.4% was recorded for the classifier.
Platelet count abnormality can be considered a major factor in predicting pediatric ALL. Machine learning algorithms can be applied efficiently to provide details for prognosis and better treatment outcomes.
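
To make the ten-fold cross-validation step concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: a one-split decision stump chosen by Gini impurity stands in for a full CART tree, and the 50-row data set, the class labels "A" and "B", and all function names are hypothetical placeholders introduced only for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def majority(labels):
    """Most frequent label; ties are broken alphabetically for determinism."""
    return max(sorted(set(labels)), key=labels.count)

def best_split(rows, labels):
    """Return (feature index, threshold) minimising weighted Gini impurity."""
    best_j, best_t, best_score = None, None, float("inf")
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_j, best_t, best_score = j, t, score
    return best_j, best_t

def fit_stump(rows, labels):
    """Fit a one-split tree: (feature, threshold, left label, right label)."""
    j, t = best_split(rows, labels)
    if j is None:  # no feature varies, so always predict the majority class
        m = majority(labels)
        return (0, float("inf"), m, m)
    left = [y for r, y in zip(rows, labels) if r[j] <= t]
    right = [y for r, y in zip(rows, labels) if r[j] > t]
    return (j, t, majority(left), majority(right))

def predict(model, row):
    j, t, left_label, right_label = model
    return left_label if row[j] <= t else right_label

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(rows, labels, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = fit_stump([rows[i] for i in train_idx],
                          [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Hypothetical toy data (NOT the study data): 50 cases, two numeric features.
# Feature 0 separates the classes with a wide gap; feature 1 is noise.
rows = [(i if i < 25 else i + 100, (i * 7) % 13) for i in range(50)]
labels = ["A" if i < 25 else "B" for i in range(50)]

mean_accuracy = cross_validate(rows, labels, k=10)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

With 50 cases and ten folds, each model is fitted on 45 cases and scored on the 5 held out. Accuracy estimated this way is generally lower than accuracy measured on the same data the model was fitted to, so the two kinds of figures should not be compared directly.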

Acknowledgements

We are thankful to the Department of Hematology and Oncology, Children Hospital and Institute of Child Health, Lahore for the provision of data.

Author information

Corresponding author

Correspondence to Nasir Mahmood.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

The study conformed to the institute’s ethical standards.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mahmood, N., Shahid, S., Bakhshi, T. et al. Identification of significant risks in pediatric acute lymphoblastic leukemia (ALL) through machine learning (ML) approach. Med Biol Eng Comput 58, 2631–2640 (2020). https://doi.org/10.1007/s11517-020-02245-2

  • DOI: https://doi.org/10.1007/s11517-020-02245-2
