A robust multiobjective Harris’ Hawks Optimization algorithm for the binary classification problem
Introduction
The amount of available data increases continuously as data storage technologies improve [1]. This high volume of data is valuable for decision-making; however, effective decision-making depends on the quality of the information [2]. For this reason, data mining tools are widely used to extract meaningful information from data that is beyond our manual processing capability [3]. Binary classification is an important technique that helps us process vast amounts of data: it analyzes an instance’s attributes and assigns the instance to one of two classes. It is especially preferred in profiling tasks, e.g., predicting whether a user will buy a specific product, or whether a patient has cancer. Even these tools, however, may struggle in terms of required computation time and space. Binary classifiers are burdened with raw, excessive amounts of data (attributes), which, in turn, diminishes their performance. This phenomenon is called the curse of dimensionality [4]. It is known that data mining tools perform more efficiently after the data is preprocessed [5]. Feature selection is one of the best-known and most effective preprocessing techniques. As data sizes grow, feature selection has become an indispensable countermeasure against the curse of dimensionality. It eliminates irrelevant and redundant features that contribute nothing to the decision-making process. Formally, feature selection seeks the subset with the minimum number of features that remains maximally informative. The main effects of feature selection are therefore twofold: (i) data mining tools run faster, as there is less data to process, and (ii) learning performance improves, as noisy data is discarded.
In the literature, feature selection is applied with three methodologies: filter-based, wrapper-based, and embedded. Filter-based methods, e.g., information gain, rely on a statistical analysis of the attributes. Wrapper-based methods employ a search algorithm together with a classifier and test the performance of each candidate subset separately. In embedded methods, the search for the best-performing feature subset and the classification are handled simultaneously, as a single task. There is a trade-off between filter- and wrapper-based methods: filter-based methods are computationally cheaper, but wrapper-based methods achieve better classification performance. Hence, recent literature on feature selection focuses on wrapper-based methods [6].
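The wrapper idea above can be illustrated with a minimal, standard-library-only sketch: a hypothetical nearest-centroid classifier is "wrapped" by a greedy forward search that adds one feature at a time. This is an illustrative stand-in, not the paper's method (which uses metaheuristic search); the classifier, data, and stopping rule are all assumptions for the example.

```python
import random

def centroid_accuracy(train, test, feats):
    """Nearest-centroid classifier restricted to the chosen feature
    indices; returns accuracy on the held-out split."""
    if not feats:
        return 0.0
    sums, counts = {}, {}
    for x, y in train:
        counts[y] = counts.get(y, 0) + 1
        s = sums.setdefault(y, [0.0] * len(feats))
        for i, f in enumerate(feats):
            s[i] += x[f]
    cent = {y: [v / counts[y] for v in s] for y, s in sums.items()}
    hits = sum(
        min(cent, key=lambda c: sum((x[f] - cent[c][i]) ** 2
                                    for i, f in enumerate(feats))) == y
        for x, y in test)
    return hits / len(test)

def greedy_wrapper(train, test, n_feats):
    """Forward selection: repeatedly add the single feature that most
    improves wrapped-classifier accuracy; stop when nothing improves."""
    selected, best = [], 0.0
    while len(selected) < n_feats:
        acc, f = max((centroid_accuracy(train, test, selected + [f]), f)
                     for f in range(n_feats) if f not in selected)
        if acc <= best:
            break
        selected, best = selected + [f], acc
    return selected, best

# Toy data: feature 0 separates the two classes, feature 1 is pure noise.
random.seed(0)
rows = []
for _ in range(200):
    y = random.randint(0, 1)
    rows.append(([y + random.gauss(0, 0.3), random.gauss(0, 1.0)], y))
train, test = rows[:150], rows[150:]
selected, acc = greedy_wrapper(train, test, 2)
```

The wrapper retrains and re-evaluates the classifier for every candidate subset, which is exactly why wrappers are costlier than filters yet track classifier performance more faithfully.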
Obtaining the optimal set of features is intractable [7], and the feature selection task is considered to be a Non-deterministic Polynomial-time Hard (NP-Hard) problem [8]. Metaheuristic algorithms are the best tools to deal with such problems [9], [10], [11]. Moreover, studies show that metaheuristic algorithms perform better than exhaustive or greedy approaches [12]. State-of-the-art metaheuristic algorithms are highly influenced by nature, and today, they are widely used in the feature selection domain [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25].
Harris’ Hawks Optimization (HHO) is a recently proposed nature-inspired metaheuristic [26], [27] based on the cooperative hunting behavior of Harris’ hawks. In 1997, Louis Lefebvre et al. proposed measuring the “IQ” of birds based on observations of their behavior [28], [29], [30]; by such measures, hawks rank among the smartest avians. The Harris’ hawk is a hunter bird that lives in groups and is unique in its cooperative foraging behavior, chasing prey with intelligent techniques; hence, these birds are known as cooperative predators. They start hunting early in the morning and usually perch on tall trees to watch for prey. The members of the group track one another’s moves and react accordingly. The hawks perform “leapfrog” actions over the target (a rabbit), clustering and splitting as they search for it.
According to the No Free Lunch (NFL) theorem [31], a newly proposed optimization algorithm can only perform as well as the others when averaged over all optimization problems; no single algorithm is the best overall. The NFL theorem thus encourages the development of more efficient, problem-specific algorithms. Therefore, we analyze the performance of this new metaheuristic on the binary classification problem. The HHO, in its original form, is a single-objective optimization algorithm, which does not suit the multiobjective nature of the feature selection problem. In this study, we propose a novel multiobjective HHO algorithm for the solution of the binary classification problem. We conduct extensive experiments on benchmark datasets retrieved from the University of California, Irvine (UCI) Machine Learning Repository and on a recent Coronavirus disease (COVID-19) dataset of 1085 patients. We then compare the solution quality of our new algorithm with state-of-the-art metaheuristic algorithms from the literature. The main contributions of the study are as follows:
- A new multiobjective HHO algorithm is developed for the solution of the well-known binary classification problem.
- New discrete exploration (perching) and exploitation (besiege) operators are proposed for the hunting patterns of the hawks.
- The initial population quality of the proposed HHO algorithm is improved with a new initialization method.
- New Pareto-optimal solutions are discovered, with fewer features and higher prediction accuracy.
- Experiments are carried out on well-known UCI benchmark datasets and a recent COVID-19 dataset. The solutions are observed to be better than or competitive with those of state-of-the-art metaheuristics in the literature.
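Pareto-optimality over the two competing objectives (subset size and prediction error, both minimized) reduces to a simple dominance check; the candidate objective values below are made-up numbers for illustration only.

```python
def dominates(a, b):
    """a dominates b if a is no worse in both objectives and strictly
    better in at least one. Tuples are (n_features, error)."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(points):
    """Filter a set of candidate solutions to the nondominated set."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (n_features, error) pairs for five candidate subsets.
cands = [(3, 0.10), (5, 0.08), (4, 0.10), (2, 0.20), (6, 0.08)]
front = pareto_front(cands)
```

Here (4, 0.10) is dominated by (3, 0.10) (same error, fewer features) and (6, 0.08) by (5, 0.08), so only three trade-off solutions survive on the front.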
The rest of the paper is organized as follows. Section 2 reviews recent studies related to our research. In Section 3, the definition of the problem is given. The proposed algorithm is described in Section 4. The experimental results of the algorithms are discussed in Section 5. Our concluding remarks and future work are presented in Section 6.
Related work
This section gives information about feature selection, multiobjective feature selection, metaheuristic algorithms that are proposed for the solution of the feature selection problem, and studies about Harris’ Hawks Optimization algorithm.
Problem definition
In this section, we give information about the multiobjective binary classification problem. The classification task is one of the most significant problems in knowledge discovery and data mining. Classification-based knowledge is fundamental for decision-making tasks such as diagnosis, recognition of patterns, prediction, etc. In a typical classification task, there exist training data with known categories (labels) and a set of unlabeled data. Data mining tools build models and evaluate their
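In wrapper-based multiobjective feature selection, each candidate solution is commonly encoded as a 0/1 mask over the attributes and scored on the two conflicting objectives. The sketch below assumes this common encoding; the accuracy function is a made-up stand-in for actually training and testing a classifier on the selected features.

```python
def evaluate(mask, accuracy_fn):
    """Score one candidate subset on the two competing objectives:
    (number of selected features, classification error). Both are
    minimized; accuracy_fn stands in for a trained classifier."""
    selected = [i for i, bit in enumerate(mask) if bit]
    return len(selected), 1.0 - accuracy_fn(selected)

# Made-up accuracy model: only attributes 0 and 2 are informative.
toy_accuracy = lambda feats: 0.5 + 0.2 * len({0, 2} & set(feats))

n_feats, error = evaluate([1, 0, 1, 1, 0], toy_accuracy)
```

Under this toy model, the mask [1, 0, 1, 0, 0] reaches the same error with one fewer feature and therefore dominates [1, 0, 1, 1, 0], which is exactly the trade-off a multiobjective search exploits.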
Proposed multiobjective Harris’ Hawks Optimization algorithm
In this section, we explain our proposed multiobjective HHO feature selection algorithm and the exploration/exploitation operators developed for the proposed algorithm.
The main goal of HHO is to capture prey through cooperative actions of the hawks called the surprise pounce (seven kills). The swarm of hawks attacks the prey cooperatively from diverse directions and converges on the target. The hunt may be completed by capturing the prey, or the effort continues with other dive techniques. The Harris’
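For orientation, one iteration of the original continuous HHO update of Heidari et al. [26] can be sketched as follows. This is not the paper's discrete operator set: only the perching (exploration) moves and the soft-besiege exploitation case are shown, while hard besiege and the rapid-dive phases are omitted, and the bound handling is a simplifying assumption.

```python
import random

def hho_step(hawks, rabbit, t, T, lb, ub):
    """One iteration of the original continuous HHO update.
    E is the rabbit's escaping energy; it decays over iterations and
    switches hawks from exploration (|E| >= 1) to exploitation (|E| < 1)."""
    dim = len(rabbit)
    mean = [sum(h[d] for h in hawks) / len(hawks) for d in range(dim)]
    new_hawks = []
    for x in hawks:
        E = 2 * random.uniform(-1, 1) * (1 - t / T)  # escaping energy
        if abs(E) >= 1:                              # exploration (perching)
            if random.random() >= 0.5:               # perch on a random hawk
                xr = random.choice(hawks)
                new = [xr[d] - random.random()
                       * abs(xr[d] - 2 * random.random() * x[d])
                       for d in range(dim)]
            else:                                    # perch near the swarm mean
                new = [(rabbit[d] - mean[d])
                       - random.random() * (lb + random.random() * (ub - lb))
                       for d in range(dim)]
        else:                                        # exploitation: soft besiege
            J = 2 * (1 - random.random())            # rabbit's jump strength
            new = [(rabbit[d] - x[d]) - E * abs(J * rabbit[d] - x[d])
                   for d in range(dim)]
        new_hawks.append([min(max(v, lb), ub) for v in new])  # clip to bounds
    return new_hawks

random.seed(1)
swarm = [[random.uniform(0.0, 1.0) for _ in range(3)] for _ in range(5)]
moved = hho_step(swarm, rabbit=[0.5, 0.5, 0.5], t=1, T=10, lb=0.0, ub=1.0)
```

The energy term E is what the discrete variant must also emulate: large |E| early on keeps hawks scattering (perching), while small |E| later concentrates them around the best-known solution (besiege).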
Experiments and results
In this section, we first describe our experimental setup and introduce the datasets used in the binary classification. Then, we provide the machine learning techniques that we utilized in the experiments. Finally, we discuss the advantages and limitations of the proposed method with empirical results.
Conclusion
In this paper, we propose a robust multiobjective Harris’ Hawks Optimization (MHHO) algorithm for the solution of the well-known binary classification problem. This study is one of the first applications of the HHO metaheuristic to the multiobjective discrete optimization domain. A swarm of hawks can cooperatively follow prey from different directions to surprise it, and these birds use many hunting patterns. In our study, six combinatorial optimization operators are proposed for
CRediT authorship contribution statement
Tansel Dokeroglu: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing, Visualization. Ayça Deniz: Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing, Visualization. Hakan Ezgi Kiziloz: Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing, Visualization.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (81)
- et al., A survey on feature selection methods, Comput. Electr. Eng. (2014)
- et al., Wrappers for feature subset selection, Artificial Intelligence (1997)
- et al., A survey on feature selection, Procedia Comput. Sci. (2016)
- et al., A survey on new generation metaheuristic algorithms, Comput. Ind. Eng. (2019)
- et al., A survey on optimization metaheuristics, Inform. Sci. (2013)
- et al., Robust multiobjective evolutionary feature subset selection algorithm for binary classification using machine learning techniques, Neurocomputing (2017)
- et al., Binary dragonfly optimization for feature selection using time-varying transfer functions, Knowl.-Based Syst. (2018)
- et al., Binary grasshopper optimisation algorithm approaches for feature selection problems, Expert Syst. Appl. (2019)
- et al., Binary ant lion approaches for feature selection, Neurocomputing (2016)
- et al., Binary grey wolf optimization approaches for feature selection, Neurocomputing (2016)
- Whale optimization approaches for wrapper feature selection, Appl. Soft Comput.
- An advanced ACO algorithm for feature subset selection, Neurocomputing
- A return-cost-based binary firefly algorithm for feature selection, Inform. Sci.
- Pareto front feature selection based on artificial bee colony optimization, Inform. Sci.
- Review of swarm intelligence-based feature selection methods, Eng. Appl. Artif. Intell.
- Multi-objective particle swarm optimization with adaptive strategies for feature selection, Swarm Evol. Comput.
- Feature selection using binary crow search algorithm with time varying flight length, Expert Syst. Appl.
- Harris hawks optimization: Algorithm and applications, Future Gener. Comput. Syst.
- Feeding innovations and forebrain size in birds, Anim. Behav.
- Feature selection for classification, Intell. Data Anal.
- Multi-objective differential evolution for feature selection in facial expression recognition systems, Expert Syst. Appl.
- Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism, Expert Syst. Appl.
- A discrete particle swarm optimization method for feature selection in binary classification problems, European J. Oper. Res.
- Novel multiobjective TLBO algorithms for the feature subset selection problem, Neurocomputing
- Different metaheuristic strategies to solve the feature selection problem, Pattern Recognit. Lett.
- Hybrid whale optimization algorithm with simulated annealing for feature selection, Neurocomputing
- A hybrid approach of differential evolution and artificial bee colony for feature selection, Expert Syst. Appl.
- An evolutionary gravitational search-based feature selection, Inform. Sci.
- Binary differential evolution with self-learning for multi-objective feature selection, Inform. Sci.
- Efficient feature selection method using real-valued grasshopper optimization algorithm, Expert Syst. Appl.
- Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection, Expert Syst. Appl.
- Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Syst. Appl.
- A novel bacterial algorithm with randomness control for feature selection in classification, Neurocomputing
- Binary butterfly optimization approaches for feature selection, Expert Syst. Appl.
- Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems, Knowl.-Based Syst.
- Differential evolution for filter feature selection based on information theory and feature ranking, Knowl.-Based Syst.
- A PSO-based multi-objective multi-label feature selection method in classification, Sci. Rep.
- Multi-objective feature selection based on artificial bee colony: An acceleration approach with variable sample size, Appl. Soft Comput.
- Classifier ensemble methods in feature selection, Neurocomputing
- Boosted binary harris hawks optimizer and feature selection, Structure
The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.