Abstract
Ensemble feature selection combines feature selection and ensemble learning to improve the generalization capability of ensemble systems. However, current methods that minimize only the training error may not generalize well to future unseen samples. In this paper, we propose a training-error- and sensitivity-based ensemble feature selection method. NSGA-III is applied to find optimal feature subsets by simultaneously minimizing two objective functions of the whole ensemble system: the training error and the sensitivity of the ensemble. With this scheme, the ensemble system maintains both high accuracy and high stability, which is expected to yield high generalization capability. Experimental results on 18 datasets show that the proposed method significantly outperforms state-of-the-art methods.
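The bi-objective scheme described above can be sketched in miniature. The toy below is not the paper's implementation: a leave-one-out 1-NN learner stands in for the ensemble, sensitivity is approximated as the fraction of predictions that flip under small Gaussian input perturbations, and a plain non-dominated-sorting evolutionary loop with one-bit-flip mutation stands in for NSGA-III. The dataset, parameters, and function names are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: only features 0 and 1 carry the class signal.
n, DIM = 120, 6
X = rng.normal(size=(n, DIM))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def train_error(mask):
    """Objective 1: leave-one-out 1-NN error on the selected columns."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 1.0
    Xs = X[:, cols]
    d = np.linalg.norm(Xs[:, None] - Xs[None, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own vote
    return float((y[d.argmin(axis=1)] != y).mean())

def sensitivity(mask, eps=0.3, reps=3):
    """Objective 2: fraction of predictions that flip under input noise."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 1.0
    Xs = X[:, cols]
    flips = 0.0
    for _ in range(reps):
        Q = Xs + rng.normal(scale=eps, size=Xs.shape)
        d = np.linalg.norm(Q[:, None] - Xs[None, :], axis=2)
        flips += float((y[d.argmin(axis=1)] != y).mean())
    return flips / reps

def dominates(f, g):
    """Pareto dominance when minimizing both objectives."""
    return all(a <= b for a, b in zip(f, g)) and f != g

POP, GENS = 16, 20
pop = [tuple(rng.integers(0, 2, DIM)) for _ in range(POP)]
for _ in range(GENS):
    kids = []
    for m in pop:                       # one-bit-flip mutation
        c = list(m)
        c[int(rng.integers(DIM))] ^= 1
        kids.append(tuple(c))
    cand = {m: (train_error(m), sensitivity(m)) for m in set(pop + kids)}
    # survivors: the non-dominated front, cycled up to the population size
    front = [m for m, f in cand.items()
             if not any(dominates(g, f) for g in cand.values())]
    pop = (front * (POP // len(front) + 1))[:POP]

best = min(pop, key=lambda m: sum(cand[m]))
print("selected mask:", best, "objectives:", cand[best])
```

The final population approximates a Pareto front of feature masks trading off accuracy against stability; in the paper's setting, a decision maker would pick one subset from this front, whereas the sketch simply takes the mask with the smallest objective sum.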
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 61876066, 61572201 and 61672443, Guangdong Province Science and Technology Plan Project (Collaborative Innovation and Platform Environment Construction) 2019A050510006, Guangzhou Science and Technology Plan Project 201804010245, and Hong Kong RGC General Research Funds under Grant 9042038 (CityU 11205314) and Grant 9042322 (CityU 11200116).
Cite this article
Ng, W.W.Y., Tuo, Y., Zhang, J. et al. Training error and sensitivity-based ensemble feature selection. Int. J. Mach. Learn. & Cyber. 11, 2313–2326 (2020). https://doi.org/10.1007/s13042-020-01120-8