Adaptive Decision Threshold-Based Extreme Learning Machine for Classifying Imbalanced Multi-label Data

Neural Processing Letters

Abstract

Multi-label learning is an active area of machine learning research because it applies to many real-world scenarios. Compared with traditional binary and multi-class classification tasks, learning from multi-label data is more easily degraded by an imbalanced data distribution. This paper describes an adaptive decision threshold-based extreme learning machine algorithm (ADT-ELM) that addresses the imbalanced multi-label data classification problem. Specifically, the macro and micro F-measure metrics are adopted as the optimization functions for ADT-ELM, and the particle swarm optimization algorithm is employed to determine the optimal combination of decision thresholds. The optimized thresholds are then used to make decisions for future multi-label instances. Twelve baseline multi-label data sets are used in a series of experiments to verify the effectiveness and superiority of the proposed algorithm. The experimental results indicate that the proposed ADT-ELM algorithm is significantly superior to many state-of-the-art multi-label imbalance learning algorithms, and it generally requires less training time than more sophisticated algorithms.
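The threshold search summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a trained classifier (an ELM in the paper) has already produced one real-valued score per label in [0, 1], that one threshold is tuned per label, and that 0.5 is the default threshold. The function names (`macro_f`, `micro_f`, `pso_thresholds`) and the PSO hyperparameters (`w`, `c1`, `c2`, swarm size) are hypothetical choices made for illustration.

```python
import random

def f_measure(tp, fp, fn):
    """F-measure from counts; taken as 0.0 when tp, fp and fn are all zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def label_counts(scores, labels, thresholds):
    """Per-label (tp, fp, fn) after thresholding the real-valued scores."""
    counts = []
    for j, t in enumerate(thresholds):
        tp = fp = fn = 0
        for s_row, y_row in zip(scores, labels):
            pred = s_row[j] >= t
            if pred and y_row[j]:
                tp += 1
            elif pred:
                fp += 1
            elif y_row[j]:
                fn += 1
        counts.append((tp, fp, fn))
    return counts

def macro_f(scores, labels, thresholds):
    """Average of the per-label F-measures (each label weighs equally)."""
    counts = label_counts(scores, labels, thresholds)
    return sum(f_measure(*c) for c in counts) / len(counts)

def micro_f(scores, labels, thresholds):
    """F-measure of the counts pooled over all labels."""
    counts = label_counts(scores, labels, thresholds)
    return f_measure(*(sum(c[k] for c in counts) for k in range(3)))

def pso_thresholds(scores, labels, objective, n_particles=20, n_iters=50,
                   lo=0.0, hi=1.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO over per-label thresholds; hyperparameters are assumptions."""
    rng = random.Random(seed)
    q = len(labels[0])
    pos = [[rng.uniform(lo, hi) for _ in range(q)] for _ in range(n_particles)]
    pos[0] = [(lo + hi) / 2.0] * q  # seed one particle at the default threshold
    vel = [[0.0] * q for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [objective(scores, labels, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(q):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Keep every threshold inside the allowed range.
                pos[i][j] = min(hi, max(lo, pos[i][j] + vel[i][j]))
            fit = objective(scores, labels, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Toy data (made up): label 1 is rare and its scores never reach 0.5.
scores = [[0.9, 0.30], [0.8, 0.10], [0.7, 0.35],
          [0.6, 0.05], [0.2, 0.12], [0.4, 0.08]]
labels = [[1, 1], [1, 0], [1, 1], [1, 0], [0, 0], [0, 0]]
best_t, best_fit = pso_thresholds(scores, labels, macro_f)
print("default macro-F:", macro_f(scores, labels, [0.5, 0.5]))
print("tuned thresholds:", best_t, "macro-F:", best_fit)
```

Because one particle starts at the default threshold, the returned fitness can never be lower than the default-threshold fitness on the data used for tuning. Passing `micro_f` instead of `macro_f` as the objective switches the optimization target; in practice the thresholds would be tuned on training or validation scores and then applied to unseen instances.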

Acknowledgements

This work was supported by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20191457, the Open Project of the Artificial Intelligence Key Laboratory of Sichuan Province under Grant No. 2019RYJ02, the National Natural Science Foundation of China under Grant Nos. 61305058 and 61572242, and the China Postdoctoral Science Foundation under Grant Nos. 2013M540404 and 2015T80481.

Author information

Corresponding author

Correspondence to Hualong Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, S., Dong, W., Cheng, K. et al. Adaptive Decision Threshold-Based Extreme Learning Machine for Classifying Imbalanced Multi-label Data. Neural Process Lett 52, 2151–2173 (2020). https://doi.org/10.1007/s11063-020-10343-3

  • DOI: https://doi.org/10.1007/s11063-020-10343-3
