
Knowledge-Based Systems

Volume 227, 5 September 2021, 107219

A robust multiobjective Harris’ Hawks Optimization algorithm for the binary classification problem

https://doi.org/10.1016/j.knosys.2021.107219

Highlights

  • We propose a multiobjective Harris’ Hawks Optimization algorithm (MHHO).

  • MHHO minimizes the number of features and maximizes the classification accuracy.

  • New discrete optimization operators are proposed for exploration and exploitation.

  • Experiments are performed on 13 benchmark UCI datasets and a COVID-19 dataset.

  • Solutions of MHHO are competitive with the state-of-the-art metaheuristics.

Abstract

The Harris’ Hawks Optimization (HHO) is a recent metaheuristic inspired by the cooperative hunting behavior of Harris’ hawks. These birds apply intelligent techniques, such as the surprise pounce (also known as “seven kills”), adapting their attack to the escape patterns of the prey. The HHO simulates these hunting patterns to obtain the best/optimal solutions to optimization problems. In this study, we propose a new multiobjective HHO algorithm for the solution of the well-known binary classification problem. In this multiobjective problem, we simultaneously minimize the number of selected features and maximize the prediction accuracy. We propose new discrete exploration (perching) and exploitation (besiege) operators for the hunting patterns of the hawks. We calculate the prediction accuracy of the selected features with four machine learning techniques, namely, Logistic Regression, Support Vector Machines, Extreme Learning Machines, and Decision Trees. To verify the performance of the proposed algorithm, we conduct comprehensive experiments on benchmark datasets retrieved from the University of California, Irvine (UCI) Machine Learning Repository. Moreover, we apply it to a recent real-world dataset, a Coronavirus disease (COVID-19) dataset. Significant improvements are observed in the comparisons with state-of-the-art metaheuristic algorithms.

Introduction

The amount of available data increases continuously as data storage technologies improve [1]. This high volume of data is valuable for decision-making; however, effective decision-making depends on the quality of information [2]. For this reason, data mining tools are widely used to extract meaningful information from data that is beyond our manual processing capability [3]. Binary classification is an important technique that helps us process vast amounts of data. It analyzes an instance’s attributes and assigns the instance to one of two classes. It is especially preferred in profiling tasks, e.g., predicting whether a user will buy a specific product, or whether a patient has cancer. As of today, however, even these tools may struggle in terms of required computation time and space. Binary classifiers become overloaded with raw, excessive amounts of data (attributes), which, in turn, diminishes their performance. This phenomenon is called the curse of dimensionality [4]. It is known that data mining tools can perform more efficiently after preprocessing is applied to the data [5]. Feature selection is one of the best-known and most effective preprocessing techniques. As data sizes grow, feature selection has become an indispensable countermeasure to mitigate the curse of dimensionality. It eliminates irrelevant and redundant features that contribute nothing to the decision-making process. Formally, feature selection is the task of finding the smallest subset of features that remains the most informative. Therefore, the main effects of feature selection are as follows: (i) data mining tools run faster, as there is less data to process, and (ii) learning performance improves, as noisy data is discarded.

In the literature, feature selection can be applied with three methodologies: filter-based, wrapper-based, and embedded. Filter-based methods, e.g., information gain, rely on a statistical analysis of the attributes. Wrapper-based methods utilize a search algorithm along with a classifier and test the performance of each candidate subset separately. In embedded methods, the search for the best-performing feature subset and the classification are handled simultaneously, as a single task. There exists a trade-off between filter- and wrapper-based methods: filter-based methods are cheaper to compute, whereas wrapper-based methods generally achieve higher accuracy. Therefore, recent literature on feature selection focuses on wrapper-based methods [6].
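The wrapper idea can be illustrated with a minimal, self-contained sketch. This is not the method proposed in the paper: as simplifying assumptions, a leave-one-out 1-nearest-neighbour score stands in for a real trained classifier, and the search simply enumerates small subsets rather than running a metaheuristic.

```python
from itertools import combinations

def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_j = None, None
        for j in range(len(X)):
            if i == j:
                continue
            # Squared Euclidean distance over the selected features only
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_d is None or d < best_d:
                best_d, best_j = d, j
        correct += (y[best_j] == y[i])
    return correct / len(X)

def wrapper_search(X, y, n_features, max_size=2):
    """Score every feature subset up to `max_size` and keep the best one."""
    best = (0.0, ())
    for k in range(1, max_size + 1):
        for subset in combinations(range(n_features), k):
            best = max(best, (loo_1nn_accuracy(X, y, subset), subset))
    return best
```

On a toy dataset where only the first feature separates the classes, the search should discard the noisy second feature, illustrating why the wrapper pays a classifier evaluation per candidate subset.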

Obtaining the optimal set of features is intractable [7], and the feature selection task is considered a Non-deterministic Polynomial-time Hard (NP-Hard) problem [8]. Metaheuristic algorithms are among the most effective tools for such problems [9], [10], [11]. Moreover, studies show that metaheuristic algorithms perform better than exhaustive or greedy approaches [12]. State-of-the-art metaheuristic algorithms are highly influenced by nature, and today they are widely used in the feature selection domain [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25].

The Harris’ Hawks Optimization (HHO) metaheuristic is a recently proposed nature-inspired optimization algorithm [26], [27] based on the cooperative hunting behavior of the hawks. In 1997, Louis Lefebvre et al. proposed measuring the “IQ” of birds based on observations of their behavior [28], [29], [30]; by this measure, hawks can be described as among the smartest avians. The Harris’ hawk is a hunter bird that lives in groups and is unique in its cooperative foraging behavior, chasing its prey with intelligent techniques. Therefore, Harris’ hawks are known as cooperative predators. They start hunting early in the morning and usually perch on tall trees to watch for prey. The members of the group know each other’s moves and react accordingly. The hawks perform “leapfrog” actions over the target (e.g., a rabbit), and they cluster and split while searching for it.

According to the No Free Lunch (NFL) theorem [31], no optimization algorithm can outperform all others when its performance is averaged over all possible optimization problems; we cannot say that any one algorithm is the overall best. The NFL theorem thus motivates the development of more efficient algorithms for specific problems. Therefore, we analyze the performance of this new metaheuristic on the binary classification problem. The HHO, in its original form, is a single-objective optimization algorithm, which is not suitable for the multiobjective nature of the feature selection problem. In this study, we propose a novel multiobjective HHO algorithm for the solution of the binary classification problem. We conduct extensive experiments with benchmark datasets retrieved from the University of California, Irvine (UCI) Machine Learning Repository, and a recent Coronavirus disease (COVID-19) dataset of 1085 patients. Then, we compare the solution quality of the new algorithm with state-of-the-art metaheuristic algorithms from the literature. The main contributions of the study are as follows:

  • A new multiobjective HHO algorithm is developed for the solution of the well-known binary classification problem.

  • New discrete exploration (perching) and exploitation (besiege) operators are proposed for the hunting patterns of the hawks.

  • The initial population quality of the proposed HHO algorithm is improved with a new initialization method.

  • New Pareto-optimal solutions are discovered with fewer features and higher prediction accuracy values.

  • Experiments are carried out on well-known UCI benchmark datasets and a recent COVID-19 dataset. The solutions are observed to be better than or competitive with those of state-of-the-art metaheuristics in the literature.

The rest of the paper is organized as follows. Section 2 reviews recent studies related to our research. In Section 3, the definition of the problem is given. The proposed algorithm is described in Section 4. The experimental results of the algorithms are discussed in Section 5. Our concluding remarks and future work are presented in Section 6.

Section snippets

Related work

This section gives information about feature selection, multiobjective feature selection, metaheuristic algorithms that are proposed for the solution of the feature selection problem, and studies about Harris’ Hawks Optimization algorithm.

Problem definition

In this section, we give information about the multiobjective binary classification problem. The classification task is one of the most significant problems in knowledge discovery and data mining. Classification-based knowledge is fundamental for decision-making tasks such as diagnosis, recognition of patterns, prediction, etc. In a typical classification task, there exist training data with known categories (labels) and a set of unlabeled data. Data mining tools build models and evaluate their
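The two objectives above (fewer selected features, higher prediction accuracy) conflict, so candidate solutions are naturally compared by Pareto dominance. A minimal sketch of that comparison, assuming (as an illustrative convention, not the paper's notation) that each candidate is summarized as an `(n_features, accuracy)` pair:

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b.

    Each solution is an (n_features, accuracy) pair: fewer features
    is better, and higher accuracy is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(solutions):
    """Keep only the non-dominated (n_features, accuracy) pairs."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t != s)]
```

For example, a subset of 4 features at 90% accuracy is dominated by one of 3 features at the same accuracy, while 3 features at 90% and 5 features at 95% are mutually non-dominated and both stay on the front.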

Proposed multiobjective Harris’ Hawks Optimization algorithm

In this section, we explain our proposed multiobjective HHO feature selection algorithm and the exploration/exploitation operators developed for the proposed algorithm.

The main goal of HHO is to capture prey with the actions of the hawks called surprise pounce (seven kills). The swarm of hawks attacks the prey cooperatively from diverse directions and converges on the target. The hunt may be completed by capturing the prey, or it continues with other dive techniques. The Harris’
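The hunting scheme described above can be sketched as a generic binary HHO-style update. This is not a reproduction of the paper's discrete perching/besiege operators, which are not given in this excerpt: the bit-flip exploration, the 0.8 copy probability toward the best hawk, and the greedy replacement are all illustrative assumptions.

```python
import random

def binary_hho_step(hawks, fitness, rng, escape_energy):
    """One iteration of a binary HHO-style update (illustrative only).

    hawks: list of 0/1 feature-mask lists; fitness: callable to maximize;
    escape_energy: |E| >= 1 triggers exploration, |E| < 1 exploitation.
    """
    best = max(hawks, key=fitness)
    new_hawks = []
    for hawk in hawks:
        if abs(escape_energy) >= 1:
            # Exploration ("perching"): flip one randomly chosen bit
            candidate = hawk[:]
            i = rng.randrange(len(hawk))
            candidate[i] ^= 1
        else:
            # Exploitation ("besiege"): copy each bit from the best hawk
            # with high probability, otherwise keep the hawk's own bit
            candidate = [b if rng.random() < 0.8 else h
                         for h, b in zip(hawk, best)]
        # Greedy replacement keeps the better of the old and new positions
        new_hawks.append(candidate if fitness(candidate) >= fitness(hawk)
                         else hawk)
    return new_hawks
```

With the greedy replacement, each hawk's fitness is non-decreasing across iterations, so the swarm as a whole never loses its best-found solution.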

Experiments and results

In this section, we first describe our experimental setup and introduce the datasets used in the binary classification. Then, we provide the machine learning techniques that we utilized in the experiments. Finally, we discuss the advantages and limitations of the proposed method with empirical results.

Conclusion

In this paper, we propose a robust multiobjective Harris’ Hawks Optimization (MHHO) algorithm for the solution of the well-known binary classification problem. This study is one of the first applications of the HHO metaheuristic to multiobjective discrete optimization domain. The swarm of hawks can cooperatively follow prey from different directions to surprise it, and there are many hunting patterns used by these birds. In our study, six combinatorial optimization operators are proposed for

CRediT authorship contribution statement

Tansel Dokeroglu: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing, Visualization. Ayça Deniz: Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing, Visualization. Hakan Ezgi Kiziloz: Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (81)

  • Mafarja, M. et al., Whale optimization approaches for wrapper feature selection, Appl. Soft Comput. (2018)
  • Kashef, S. et al., An advanced ACO algorithm for feature subset selection, Neurocomputing (2015)
  • Zhang, Y. et al., A return-cost-based binary firefly algorithm for feature selection, Inform. Sci. (2017)
  • Hancer, E. et al., Pareto front feature selection based on artificial bee colony optimization, Inform. Sci. (2018)
  • Rostami, M. et al., Review of swarm intelligence-based feature selection methods, Eng. Appl. Artif. Intell. (2021)
  • Han, F. et al., Multi-objective particle swarm optimization with adaptive strategies for feature selection, Swarm Evol. Comput. (2021)
  • Chaudhuri, A. et al., Feature selection using binary crow search algorithm with time varying flight length, Expert Syst. Appl. (2021)
  • Heidari, A.A. et al., Harris hawks optimization: Algorithm and applications, Future Gener. Comput. Syst. (2019)
  • Lefebvre, L. et al., Feeding innovations and forebrain size in birds, Anim. Behav. (1997)
  • Dash, M. et al., Feature selection for classification, Intell. Data Anal. (1997)
  • Mlakar, U. et al., Multi-objective differential evolution for feature selection in facial expression recognition systems, Expert Syst. Appl. (2017)
  • Amoozegar, M. et al., Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism, Expert Syst. Appl. (2018)
  • Unler, A. et al., A discrete particle swarm optimization method for feature selection in binary classification problems, European J. Oper. Res. (2010)
  • Kiziloz, H.E. et al., Novel multiobjective TLBO algorithms for the feature subset selection problem, Neurocomputing (2018)
  • Yusta, S.C., Different metaheuristic strategies to solve the feature selection problem, Pattern Recognit. Lett. (2009)
  • Mafarja, M.M. et al., Hybrid whale optimization algorithm with simulated annealing for feature selection, Neurocomputing (2017)
  • Zorarpacı, E. et al., A hybrid approach of differential evolution and artificial bee colony for feature selection, Expert Syst. Appl. (2016)
  • Taradeh, M. et al., An evolutionary gravitational search-based feature selection, Inform. Sci. (2019)
  • Zhang, Y. et al., Binary differential evolution with self-learning for multi-objective feature selection, Inform. Sci. (2020)
  • Zakeri, A. et al., Efficient feature selection method using real-valued grasshopper optimization algorithm, Expert Syst. Appl. (2019)
  • Tubishat, M. et al., Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection, Expert Syst. Appl. (2020)
  • Zhang, Y. et al., Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Syst. Appl. (2019)
  • Wang, H. et al., A novel bacterial algorithm with randomness control for feature selection in classification, Neurocomputing (2017)
  • Arora, S. et al., Binary butterfly optimization approaches for feature selection, Expert Syst. Appl. (2019)
  • Mafarja, M. et al., Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems, Knowl.-Based Syst. (2018)
  • Hancer, E. et al., Differential evolution for filter feature selection based on information theory and feature ranking, Knowl.-Based Syst. (2018)
  • Zhang, Y. et al., A PSO-based multi-objective multi-label feature selection method in classification, Sci. Rep. (2017)
  • Wang, X.-h. et al., Multi-objective feature selection based on artificial bee colony: An acceleration approach with variable sample size, Appl. Soft Comput. (2020)
  • Kiziloz, H.E., Classifier ensemble methods in feature selection, Neurocomputing (2021)
  • Zhang, Y. et al., Boosted binary Harris hawks optimizer and feature selection, Structure (2020)

    The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
