Empirical Analysis of Machine Learning Algorithms on Imbalance Electrocardiogram Based Arrhythmia Dataset for Heart Disease Detection

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Living beings face many hazards over the course of their lives. Owing to its high mortality rate, heart disease (HD) is among the leading ones. It is one of the world's most critical diseases because its diagnosis is complex and its treatment is expensive. It has severely affected the health care sectors of developing and developed countries alike. In developing countries, inadequate preventive measures, diagnostic shortcomings, inefficient medical support, and a lack of medical staff and advancements have made its impact especially severe. This paper surveys the state of the art in intelligent solutions for HD detection and presents an empirical analysis of machine learning algorithms on an electrocardiogram-based arrhythmia dataset for disease detection. Eight machine learning algorithms, namely Support Vector Machine, K-Nearest Neighbors, Random Forest, Extra Tree, Bagging, Decision Tree, Linear Regression, and Adaptive Boosting, are critically investigated under both imbalanced and balanced class paradigms. Their performance is evaluated with four metrics: precision, recall, accuracy, and F1-score. The empirical analysis offers insight into the structure of the dataset. In the initial, imbalanced binary classification setting, the majority class is classified more accurately than the minority class because the training data contain more majority-class tuples than minority-class tuples. The paper uses the Synthetic Minority Over-sampling Technique (SMOTE) to balance the data. Balancing increased not only the overall accuracy of the algorithms but also the accuracy of the individual classes, so the accuracy of the minority class is not sacrificed.
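To make the balancing step concrete, the following is a minimal sketch of the SMOTE idea together with per-class precision, recall, and F1. It is not the authors' implementation: the function names `smote` and `per_class_metrics`, the toy data, and the choice of `k` are all assumptions made for illustration, and the code uses only the Python standard library so that it stays self-contained. Each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy data (hypothetical): 8 majority points and 3 minority points.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, len(majority) - len(minority), k=2)
```

Reporting the metrics per class, rather than a single overall accuracy, is what exposes the gap between majority and minority classes that the abstract describes. In practice, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.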




Corresponding author

Correspondence to Shwet Ketu.

About this article

Cite this article

Ketu, S., Mishra, P.K. Empirical Analysis of Machine Learning Algorithms on Imbalance Electrocardiogram Based Arrhythmia Dataset for Heart Disease Detection. Arab J Sci Eng 47, 1447–1469 (2022). https://doi.org/10.1007/s13369-021-05972-2

  • DOI: https://doi.org/10.1007/s13369-021-05972-2
