Article

Boosting Arithmetic Optimization Algorithm with Genetic Algorithm Operators for Feature Selection: Case Study on Cox Proportional Hazards Model

by
Ahmed A. Ewees
1,*,
Mohammed A. A. Al-qaness
2,
Laith Abualigah
3,4,
Diego Oliva
5,
Zakariya Yahya Algamal
6,
Ahmed M. Anter
7,
Rehab Ali Ibrahim
8,
Rania M. Ghoniem
9,10,* and
Mohamed Abd Elaziz
8,11,12
1
Department of Computer, Damietta University, Damietta 34517, Egypt
2
State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
3
Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
4
School of Computer Sciences, Universiti Sains Malaysia, Gelugor 11800, Malaysia
5
Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Department of Computer Sciences, Universidad de Guadalajara, Guadalajara 44430, Mexico
6
Department of Statistics and Informatics, University of Mosul, Mosul 41002, Iraq
7
Faculty of Computers and Artificial Intelligence, Beni-Suef University, Beni Suef 62511, Egypt
8
Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
9
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 84428, Saudi Arabia
10
Department of Computer, Mansoura University, Mansoura 35516, Egypt
11
Artificial Intelligence Research Center (AIRC), Ajman University, Ajman P.O. Box 346, United Arab Emirates
12
School of Computer Science and Robotics, Tomsk Polytechnic University, 634050 Tomsk, Russia
*
Authors to whom correspondence should be addressed.
Mathematics 2021, 9(18), 2321; https://doi.org/10.3390/math9182321
Submission received: 22 August 2021 / Revised: 11 September 2021 / Accepted: 15 September 2021 / Published: 19 September 2021
(This article belongs to the Special Issue Evolutionary Algorithms in Artificial Intelligent Systems)

Abstract

Feature selection is a well-known preprocessing procedure, and it is considered a challenging problem in many domains, such as data mining, text mining, medicine, biology, public health, image processing, data clustering, and others. This paper proposes a novel feature selection method, called AOAGA, using an improved metaheuristic optimization method that combines the conventional Arithmetic Optimization Algorithm (AOA) with Genetic Algorithm (GA) operators. The AOA is a recently proposed optimizer; it has been employed to solve several benchmark and engineering problems and has shown promising performance. The main aim behind the modification of the AOA is to enhance its search strategies. The conventional version suffers from weaknesses in its local search strategy and in the trade-off between its search strategies; therefore, the operators of the GA are used to overcome these shortcomings. The proposed AOAGA was evaluated with several well-known benchmark datasets, using several standard evaluation criteria, namely accuracy, number of selected features, and fitness function. Finally, the results were compared with state-of-the-art techniques to prove the performance of the proposed AOAGA method. Moreover, to further assess the performance of the proposed AOAGA method, two real-world problems containing gene datasets were used. The findings of this paper illustrate that the proposed AOAGA method finds new best solutions for several test cases and achieves promising results compared to other methods published in the literature.

1. Introduction

Datasets of broad sizes exist in several real-world applications such as pattern recognition, data mining, signal processing, machine learning, text processing, image processing, and web content classification [1,2,3]. These datasets typically contain a large number of features that are difficult to handle. As a result, the performance of these applications is often degraded by redundant, noisy, and meaningless data [4,5,6].
Researchers use dimensionality reduction to eliminate unimportant and redundant data by mapping the original high-dimensional data into a new lower-dimensional space [7,8]. Dimensionality reduction also makes it possible to visualize and interpret the data and can increase the application's output [9]. Feature selection is one of the most common strategies used in the dimensionality reduction domain to solve the dimensionality problem. The goal of handling features is to represent, with high precision, the original features of a specific problem domain by an optimal subset of newly selected features [10]. The feature selection process can be executed in the backward or forward direction. The backward selection strategy starts with all features; it then eliminates one attribute at each step (the one whose removal reduces the error the most). This procedure is repeated until any further elimination increases the error [11,12]. The forward selection strategy starts with an empty set; at each stage, it adds the one feature that reduces the error the most, and stops when no further addition meaningfully reduces it [13,14].
In general, there are two types of feature selection methods, filter and wrapper methods; the critical distinction between them lies in the technique used to choose the subset of new features [15]. Wrapper approaches use a learning method to evaluate the feature subset. However, it is not feasible to use a wrapper on a high-dimensional dataset, because deciding on the relevant features requires significant time. In contrast, unlike wrapper approaches, the filter strategy does not use learning techniques to select the features. For these reasons, the wrapper is computationally costly, so it cannot be extended to large datasets, whereas filter algorithms are less expensive in terms of computation [16,17].
The feature cost, which means the cost of obtaining a feature attribute, is a particular concern in machine learning and data mining with different cost types. It can be expressed in different ways, such as money, time, pain, and measurement cost, to name a few [18,19,20]. In medical diagnosis, it is usually inexpensive and painless to obtain the values of symptom characteristics observed by eye. However, obtaining the values of other diagnostic characteristics often involves varying costs and risks due to the need to perform several clinical examinations. These expenses are either resources or time for test results, or the patient's physical and psychological burden. Improving the diagnostic effect by choosing the most critical characteristics is essential for this problem. However, it is also necessary to increase the comfort level by choosing low-cost features or saving money. In this case, before settling on the selected diagnostic features, a doctor needs to estimate the trade-off between the diagnostic impact and the cost. There are several related examples in real-world implementations. However, most of the conventional feature selection approaches neglect the cost of the features [21,22].
When all possible subsets of the dataset are generated during the search process, the complexity and processing time are very high, on the order of $2^x$, where $x$ is the number of features in the dataset [23]. Therefore, researchers have sought to formulate approaches that solve the feature selection (FS) problem and provide solutions more efficiently than conventional techniques. This problem is considered one of the most pressing problems faced by new technology due to the size of the available information and data [24,25,26]. The use of metaheuristic algorithms is one such approach. Metaheuristic algorithms have been applied to many topics in artificial intelligence and have led to many solutions [27,28,29]. They are now widely used to solve feature selection problems [30]. According to the parameters governing the way these algorithms operate, they generate subsets randomly. It has been shown that they can help minimize execution time and produce competitive outcomes. The Grey Wolf Optimizer (GWO) [31], Whale Optimization Algorithm [32], Monarch Butterfly Algorithm [23], Coyote Optimization Algorithm [33], Genetic Algorithm [34], Krill Herd Algorithm [35], Harmony Search [36], Aquila Optimizer [37], Particle Swarm Algorithm [38], and Parallel Membrane-inspired Framework [39] are examples of the metaheuristics that have been used to address feature selection problems.
Several techniques have been published in the literature [40,41,42,43,44]. For example, in [15], the enhancement is carried out by integrating the opposition-based learning methodology and differential evolution with the Moth-Flame Optimization (MFO) algorithm. To improve the MFO, opposition-based learning is used to produce a better initial population, while differential evolution is applied to boost the MFO's exploitation ability. Therefore, unlike the conventional MFO algorithm, the suggested approach, denoted OMFODE, avoids getting trapped in local optima and converges faster. The work in [45] proposes a hybrid solution that incorporates two search methods: GWO and Particle Swarm Optimization (PSO). The GWO is inspired by the leadership hierarchy and the hunting behavior of grey wolves in nature, where grey wolves prefer to live in packs. The goal of this hybridization is to combine exploitation and exploration in a balanced manner.
In [23], the novel monarch butterfly optimization (MBO) algorithm is implemented within a wrapper feature selection approach that uses the k-nearest neighbor (KNN) classifier. Tests are conducted on eighteen benchmark datasets. The results showed that MBO was superior to four optimization algorithms, providing a high classification accuracy rate. For feature selection challenges in medical diagnosis and other problems, a hybrid crow search algorithm integrated with chaos theory and the fuzzy c-means technique, designated CFCSA, was developed in [46]. The crow search algorithm adopts a global optimization methodology in the proposed CFCSA framework to reduce sensitivity to local optima. The fuzzy c-means (FCM) objective function is used as the cost function for the chaotic crow search algorithm. Like other optimization algorithms, the Salp Swarm Algorithm (SSA) suffers from limited population diversity and falling into local optima. The work in [47] provides an improved SSA variant, the Dynamic Salp Swarm Algorithm (DSSA), to solve these problems. Two significant changes were introduced into the SSA to fix its challenges. The first entails creating a new equation for updating the positions of salps, regulated by Singer's chaotic map; this change aims to increase the diversity of SSA solutions. The second entails creating a new local search algorithm (LSA) to improve the exploitation of SSA.
As discussed above, optimization algorithms have shown promising results when used to address feature selection problems in recent decades. However, considering the increasing research in this direction, a fundamental question still emerges: do we need further optimization approaches to find better outcomes? In this regard, newly introduced metaheuristic algorithms, derived from arithmetic operators, biological evolution, swarm behavior, physical concepts, and mathematical laws, have been increasingly investigated. However, researchers report that these approaches frequently work ineffectively when there is a substantial increase in complexity and problem dimensionality. This research has two primary motivations: (A) The No-Free-Lunch (NFL) theorem states that there is no single optimization technique that solves all optimization problems, so an optimizer's outstanding success on a specific group of problems does not guarantee that it performs equally effectively on another group. This has inspired many scientists in this area to apply existing approaches to new problem groups, and it is also the basis and inspiration for this research. We suggest a novel optimization method by integrating the Arithmetic Optimization Algorithm (AOA) and the operators (crossover and mutation) of the Genetic Algorithm to solve the feature selection problem with higher dimensionality. This problem can be categorized as hard, and it cannot be solved easily by a traditional technique; it therefore needs an advanced and improved method to find the optimal solution for the cases used in this paper. (B) To the best of the authors' knowledge, the proposed method is used for the first time to solve feature selection problems. The proposed method tackles the conventional AOA's main weaknesses by avoiding the local search problem and by balancing the search strategies. As optimization methods are the best choice for such a complicated problem, we use the proposed method, according to its previous performance and with some improvements, to efficiently tackle the feature selection problem and find new best solutions. Twenty feature selection datasets are used to prove the proposed method's performance, and the results are compared with other state-of-the-art methods using standard evaluation criteria. The results show that the proposed method is promising in solving high-dimensional feature selection problems compared to other well-known methods.
The main contributions invented in this paper are given as follows.
  • A modified approach combining the classical AOA and GA is proposed that further enhances the exploration and convergence characteristics of this evolutionary wrapper-based feature selection method through a diverse population design.
  • Boosted mutation and crossover operators are introduced for the exploration and exploitation of the search space.
  • The inclusion of the GA operators promotes the convergence rate by balancing the exploration and exploitation characteristics of the proposed approach.
  • Reducing the feature input set using the proposed search method for high-dimensional problems is conducive to developing a high-performing decision method.
  • The proposed method is compared with several state-of-the-art methods on twenty datasets.
The rest of this paper is organized as follows. Section 2 presents the general methods, and Section 3 describes the proposed improved algorithm. Then, Section 4 presents the experiments and the discussion of the results. Finally, Section 5 concludes the paper and outlines potential future research directions.

2. Methods

2.1. Problem Formulation of FS

In this section, the mathematical formulation of FS is introduced. In general, consider the classification (i.e., supervised learning) of any dataset of size $N_S \times N_F$, where $N_S$ is the number of samples and $N_F$ stands for the number of features. The main objective of the FS problem is to select a subset of features $S$ from the total number of features ($N_F$), where the size of $S$ is smaller than $N_F$. This can be achieved by minimizing the following objective function:
$$Fit = \lambda \times \gamma_S + (1 - \lambda) \times \frac{|S|}{N_F} \quad (1)$$
where $\gamma_S$ refers to the classification error obtained using $S$, and $|S|$ is the number of selected features. $\lambda$ is used to balance between $\frac{|S|}{N_F}$ and $\gamma_S$.
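For illustration, with hypothetical values $\lambda = 0.99$, a classification error $\gamma_S = 0.05$, and $|S| = 10$ selected features out of $N_F = 30$, Equation (1) gives $Fit = 0.99 \times 0.05 + 0.01 \times \frac{10}{30} \approx 0.0528$; a subset achieving the same error with fewer features would obtain a lower (better) fitness value.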

2.2. Arithmetic Optimization Algorithm (AOA)

The preliminaries of the AOA [48] are described in this section. Generally, like other MH algorithms, the AOA has two search phases, exploration and exploitation, inspired by arithmetic operations, namely multiplication (×), division (÷), addition (+), and subtraction (−). First, the AOA generates a set of $N$ solutions (agents). Each one represents a solution to the tested problem. Thus, the solutions or agents form the population $X$, as:
$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,j} & \cdots & x_{1,n-1} & x_{1,n} \\ x_{2,1} & \cdots & x_{2,j} & \cdots & \cdots & x_{2,n} \\ \vdots & & \vdots & & & \vdots \\ x_{N-1,1} & \cdots & x_{N-1,j} & \cdots & \cdots & x_{N-1,n} \\ x_{N,1} & \cdots & x_{N,j} & \cdots & x_{N,n-1} & x_{N,n} \end{bmatrix} \quad (2)$$
Next, the fitness function of each solution is computed to detect the best one, $X_b$. Then, depending on the value of the Math Optimizer Accelerated function ($MOA$), the AOA performs the exploration or exploitation process. The $MOA$ is updated according to the following equation:
$$MOA(t) = Min_{MOA} + t \times \frac{Max_{MOA} - Min_{MOA}}{M_t} \quad (3)$$
in which $M_t$ represents the total number of iterations, and $Min_{MOA}$ and $Max_{MOA}$ represent the minimum and maximum values of the accelerated function, respectively. Moreover, the multiplication (M) and division (D) operators are employed in the exploration phase of the AOA, as presented in the following equation:
$$x_{i,j}(t+1) = \begin{cases} X_b^j \div (MOP + \epsilon) \times ((UB_j - LB_j) \times \mu + LB_j), & r_2 < 0.5 \\ X_b^j \times MOP \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases} \quad (4)$$
in which $\epsilon$ represents a small integer value, $LB_j$ and $UB_j$ are the lower and upper boundaries of the search domain in the $j$th dimension, and $\mu = 0.5$ is the control parameter. Moreover, the Math Optimizer ($MOP$) can be described as:
$$MOP(t) = 1 - \frac{t^{1/\alpha}}{M_t^{1/\alpha}} \quad (5)$$
where $\alpha = 5$ represents the dynamic parameter that determines the precision of the exploitation phase throughout the iterations.
Furthermore, the addition (A) and subtraction (S) operators are used to implement the AOA exploitation phase, using the following equation.
$$x_{i,j}(t+1) = \begin{cases} X_b^j - MOP \times ((UB_j - LB_j) \times \mu + LB_j), & r_3 < 0.5 \\ X_b^j + MOP \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases} \quad (6)$$
in which $r_3$ represents a random number generated in [0,1]. After that, the agents' updating process is implemented using the AOA operators. To sum up, Algorithm 1 illustrates the main steps of the AOA.
Algorithm 1 Steps of AOA
 1: Input: The parameters of AOA such as the dynamic exploitation parameter ($\alpha$), control parameter ($\mu$), number of agents ($N$), and total number of iterations ($M_t$).
 2: Construct the initial values for the agents $X_i$, $i = 1, \ldots, N$.
 3: while ($t < M_t$) do
 4:   Compute the fitness function for each agent.
 5:   Determine the best agent $X_b$.
 6:   Update the $MOA$ and $MOP$ using Equation (3) and Equation (5), respectively.
 7:   for $i = 1$ to $N$ do
 8:     for $j = 1$ to $Dim$ do
 9:       Update the values of $r_1$, $r_2$, and $r_3$.
10:       if $r_1 > MOA$ then
11:         Exploration phase
12:         Use Equation (4) to update $X_i$.
13:       else
14:         Exploitation phase
15:         Use Equation (6) to update $X_i$.
16:       end if
17:     end for
18:   end for
19:   $t = t + 1$
20: end while
21: Output the best agent (feature subset) ($X_b$).
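As a concrete illustration of Equations (3)–(6) and Algorithm 1, the following is a minimal Python sketch of the AOA update loop. The fitness callable, the bounds, and the parameter values (e.g., $Min_{MOA}$, $Max_{MOA}$) are illustrative assumptions and not the exact settings used in this paper.

import numpy as np

def aoa(fitness, dim, n_agents=25, max_iter=100, lb=0.0, ub=1.0,
        min_moa=0.2, max_moa=1.0, alpha=5, mu=0.5, eps=1e-12):
    # Initialize N agents uniformly inside the search domain.
    X = lb + np.random.rand(n_agents, dim) * (ub - lb)
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), fit.min()
    for t in range(1, max_iter + 1):
        moa = min_moa + t * (max_moa - min_moa) / max_iter                # Equation (3)
        mop = 1.0 - (t ** (1.0 / alpha)) / (max_iter ** (1.0 / alpha))    # Equation (5)
        scale = (ub - lb) * mu + lb
        for i in range(n_agents):
            for j in range(dim):
                r1, r2, r3 = np.random.rand(3)
                if r1 > moa:   # exploration: division or multiplication, Equation (4)
                    X[i, j] = best[j] / (mop + eps) * scale if r2 < 0.5 else best[j] * mop * scale
                else:          # exploitation: subtraction or addition, Equation (6)
                    X[i, j] = best[j] - mop * scale if r3 < 0.5 else best[j] + mop * scale
            X[i] = np.clip(X[i], lb, ub)
            fit[i] = fitness(X[i])
            if fit[i] < best_fit:
                best, best_fit = X[i].copy(), fit[i]
    return best, best_fit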

2.3. Genetic Algorithm

In this section, the basic information of the Genetic Algorithm (GA) is introduced [49]. In general, GA is a population-based metaheuristic technique, and each individual inside the population represents a feasible solution. Three stages are used in GA to update the individuals, namely the selection, crossover, and mutation processes. In the selection process, two individuals are selected randomly, which helps enhance the population's diversity. Then the crossover process generates new individuals from the selected individuals (parents) by exchanging their values. After that, mutation is applied to modify a randomly selected individual with a random value belonging to the search space. Finally, according to the fitness values of the newly generated individuals and their parents, the current population is updated by selecting the best individuals to form the new population. Updating the population using the three processes of GA (i.e., selection, crossover, and mutation) is then repeated until the stop conditions are reached.

2.3.1. Crossover

One of the basic operators in GA is the crossover; in the related literature there are different modifications of it. The simplest crossover is the single-point method. It requires two parents that are randomly selected from the population. The parents are used to generate offspring using a single point that divides the information contained in them. The values after this single point are interchanged between the two parents, and new solutions are created. Figure 1 graphically shows how the single-point crossover works.
The single-point crossover is a good alternative, but for real-coded representations it is better to employ another version. The blend crossover, also known as BLX-$\alpha$, is a real-coded operator. Similar to the single-point method, it is necessary to take two parents $x^1$ and $x^2$ from the population. Using the parents, a portion $x_i^c$ is extracted from both of them. Equation (7) provides a better explanation of BLX-$\alpha$.
$$X_i^1 = \min(x_i^1, x_i^2) - \alpha d_i, \qquad X_i^2 = \max(x_i^1, x_i^2) + \alpha d_i, \qquad d_i = |x_i^1 - x_i^2| \quad (7)$$
where $x_i^1$ and $x_i^2$ are elements taken from $x^1$ and $x^2$, and $\alpha$ is a positive value set to 0.5 according to [50].
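The following short Python sketch illustrates the BLX-$\alpha$ crossover of Equation (7), assuming real-coded parent vectors stored as NumPy arrays; sampling the offspring uniformly from the blended interval is a common implementation choice rather than something prescribed by the equation itself.

import numpy as np

def blx_alpha(parent1, parent2, alpha=0.5):
    # Per-gene blended interval [min - alpha*d, max + alpha*d], Equation (7).
    lo = np.minimum(parent1, parent2)
    hi = np.maximum(parent1, parent2)
    d = np.abs(parent1 - parent2)
    low, high = lo - alpha * d, hi + alpha * d
    # Offspring sampled uniformly from the blended interval (common choice).
    return low + np.random.rand(len(parent1)) * (high - low)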

2.3.2. Mutation

The mutation is an operator that helps to explore around a specific solution. Similar to the crossover, there are several ways to perform the mutation. However, in this article the Gaussian mutation introduced by Higashi and Iba [51] is considered. In this kind of mutation, an element is taken from the population and modified using a random number drawn from a Gaussian distribution. The modified solution is a mutated individual, and it is computed as follows:
$$mutate(x_i^d) = x_i^d \times (1 + gaussian(\sigma)) \quad (8)$$
In Equation (8), $x_i^d$ is the selected individual from the population, and $gaussian(\sigma)$ is a random number generated from a Gaussian distribution with a standard deviation of $\sigma = 0.1$.
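A brief sketch of the Gaussian mutation of Equation (8) is given below; applying the perturbation to each gene with a fixed probability is an illustrative assumption.

import numpy as np

def gaussian_mutation(individual, sigma=0.1, p_mut=0.1):
    x = individual.copy()
    # Each gene is perturbed with probability p_mut, following Equation (8).
    mask = np.random.rand(len(x)) < p_mut
    x[mask] = x[mask] * (1.0 + np.random.normal(0.0, sigma, mask.sum()))
    return x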

2.3.3. Selection

The selection operator is also important because it extracts the elements of the population that will be manipulated by the crossover and mutation. Different mechanisms can also be found here, but the most common is the roulette wheel [52]. This method is based on the fitness, and it works by assigning a probability $p_s$ to each member of the population. The population is then segmented into different regions represented by the individuals. In a population of $n$ candidate solutions defined as $P = \{a_1, a_2, \ldots, a_n\}$, the element $a_i$ possesses a fitness value $f(a_i)$; then the probability of $a_i$ being selected is computed as:
$$p_s(a_i) = \frac{f(a_i)}{\sum_{j=1}^{n} f(a_j)}, \quad j = 1, 2, \ldots, n \quad (9)$$
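A minimal sketch of roulette-wheel selection following Equation (9) is shown below; it assumes non-negative fitness values where larger is better, so for a minimization objective such as Equation (1) the fitness would first be transformed (e.g., inverted).

import numpy as np

def roulette_wheel_select(population, fitness_values, n_selected=2):
    fitness_values = np.asarray(fitness_values, dtype=float)
    probs = fitness_values / fitness_values.sum()   # Equation (9)
    idx = np.random.choice(len(population), size=n_selected, replace=False, p=probs)
    return [population[i] for i in idx]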

3. Proposed AOAGA Feature Selection

Optimization techniques, as mentioned above, have been successfully used in many research fields to solve various complicated problems. In this section, the proposed optimization method is presented to solve the feature selection problem. This problem is a widespread complex issue that appears in many knowledge-based approaches, and it needs an efficient method to solve it. It is typically based on selecting the optimal features from a massive number of features to reduce the computational time and increase the performance of the underlying system analysis.
Figure 2 depicts the structure of the developed feature selection method. This method depends on enhancing the performance of AOA to find the optimal subset of relevant features using the operators of the genetic algorithm (GA).
The developed FS method is called AOAGA. The main difference between AOAGA and the original AOA is that the exploration phase of the proposed AOAGA is improved so that it can explore more regions of the search domain than the original version of the AOA, and it can also escape from getting stuck in local optima thanks to the operators of the GA.
The AOAGA starts by setting the initial population $U$, which has $N$ agents; this is formulated as follows:
$$U_i = LB_i + \alpha_i \times (UB_i - LB_i) \quad (10)$$
In Equation (10), $\alpha_i \in [0, 1]$ is a random value, and $UB_i = 1$ and $LB_i = 0$ are the limits of the search domain. The next step in the developed AOAGA is to assess the quality of the selected features. This is achieved by converting each agent into binary form using the following equation.
$$BU_{ij} = \begin{cases} 1 & \text{if } U_{ij} > 0.5 \\ 0 & \text{otherwise} \end{cases} \quad (11)$$
Thereafter, the classification error is computed after removing the irrelevant features that correspond to zeros in $BU$. This is performed using Equation (12).
$$Fit_i = \lambda \times \gamma_i + (1 - \lambda) \times \frac{|BU_i|}{N_F} \quad (12)$$
In Equation (12), $\lambda \in [0, 1]$ refers to the weight applied to balance the two terms of Equation (12). $N_F$ refers to the number of features, and $|BU_i|$ is the number of selected features, corresponding to the ones inside $BU_i$. $\gamma_i$ is the classification error obtained using the features selected by $U_i$ and is computed based on the KNN classifier. In this study, the KNN is trained using a training set representing 80% of the dataset, while the rest (20%) is used as a testing set to evaluate the trained KNN.
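As an illustration of Equations (11) and (12), the following sketch binarizes an agent and scores it with a KNN classifier; it assumes scikit-learn's KNeighborsClassifier, and the values of $\lambda$ and the number of neighbors are illustrative assumptions.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fitness(agent, X_train, y_train, X_test, y_test, lam=0.99, k=5):
    # Equation (11): keep features whose continuous value exceeds 0.5.
    mask = agent > 0.5
    if not mask.any():                 # avoid an empty feature subset
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train[:, mask], y_train)
    error = 1.0 - knn.score(X_test[:, mask], y_test)
    # Equation (12): weighted sum of error and selected-feature ratio.
    return lam * error + (1.0 - lam) * mask.sum() / agent.size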
Thereafter, the best agent $U_b$ is determined and used to update the other agents with the operators of GA and AOA. This updating process is performed using Equation (13).
$$U_i = \begin{cases} \text{Operators of GA} & \text{if } t < 0.10 \times t_{max} \\ \text{Operators of AOA} & \text{otherwise} \end{cases} \quad (13)$$
The next step is to check the stop conditions; if they are not met, the updating process is repeated. Otherwise, the best agent is returned, the testing set is reduced according to it, and the quality of the classification is evaluated using the updated testing set. The flowchart of the developed AOAGA is given in Figure 2.
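The following sketch outlines how the transition mechanism of Equation (13) can drive the main AOAGA loop; the helper functions ga_update (selection, crossover, and mutation) and aoa_update (Equations (4) and (6)) are hypothetical names assumed to be defined elsewhere.

import numpy as np

def aoaga(fitness, n_features, ga_update, aoa_update, n_agents=25, t_max=100):
    # Equation (10): agents initialized uniformly in [0, 1] for each feature.
    U = np.random.rand(n_agents, n_features)
    fit = np.array([fitness(u) for u in U])
    best, best_fit = U[fit.argmin()].copy(), fit.min()
    for t in range(t_max):
        # Equation (13): GA operators in the first 10% of iterations, AOA afterwards.
        U = ga_update(U, fit, best) if t < 0.10 * t_max else aoa_update(U, best, t, t_max)
        fit = np.array([fitness(u) for u in U])
        if fit.min() < best_fit:
            best, best_fit = U[fit.argmin()].copy(), fit.min()
    return best, best_fit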
The primary references selected in this paper were chosen according to their importance and results in this field. We focused on the most closely related research in this field to support our work and obtain significant results and descriptions. However, the main limitation of the proposed method in this paper is that additional real-world feature selection problems for other medical purposes were not considered, nor were comparisons with other advanced methods published in the literature in this domain; such a process could further prove the ability of the proposed method to solve various feature selection problems.
The complexity of the developed AOAGA depends on several parameters, such as the number of agents $N$, the total number of iterations $M_t$, and the dimension of the tested problem ($N_F$). Thus, the complexity of AOAGA in terms of Big O can be formulated as:
$$O(AOAGA) = O(M_{t1} \times N \times N_F) + O(M_{t2}(N \times N_F + N \times N_F + N)) \quad (14)$$
Since $M_{t2} = M - M_{t1}$, we can rewrite Equation (14) as:
$$O(AOAGA) = O(M_{t1} \times N \times N_F) + O((M - M_{t1})(2N \times N_F + N)) \quad (15)$$
$$O(AOAGA) = O(N \times (M(2N_F + 1) - M_{t1}(N_F + 1))) \quad (16)$$
where $M_{t1}$ stands for the number of iterations used to update solutions using the operators of GA.

4. Experimental Results and Discussion

In this study, the ability of the developed AOAGA to improve classification performance by removing irrelevant features is evaluated. This was achieved using twenty datasets from the UCI machine learning repository [53] and real-world datasets from [54,55].
The developed AOAGA is compared with ten metaheuristic techniques, including the Slime Mould Algorithm (SMA) [56], Harris Hawks Optimization (HHO) [57], GA [58], Multi-Verse Optimizer (MVO) [59], SSA [60], MFO, Grasshopper Optimization Algorithm (GOA) [61], PSO, and GWO [62]. Each algorithm is used with the same parameters as in its original implementation.
These algorithms are run on an Intel Core i5 processor with 8 GB RAM using Matlab 2014b. The population size is set to 25, whereas the maximum number of iterations is 100. Thirteen independent runs are produced for each algorithm.

4.1. Performance Measures

To validate the performance of the developed AOAGA, a set of evaluation metrics is used, for example, the accuracy, the number of selected features, and the average and standard deviation of the fitness value [63,64,65]. The definition of each measure is given as follows:
  • Average accuracy ($Avg_{acc}$) is used to compute the ability of an algorithm to predict the correct label of each class over the runs. A higher value is better [65]. It is defined as:
    $$Avg_{acc} = \frac{1}{N_r} \sum_{k=1}^{N_r} Acc_{Best_k}, \qquad Acc_{Best_k} = \frac{TP + TN}{TP + FN + FP + TN}$$
  • Standard deviation (STD) is used to check to what extent an algorithm obtains the same results over different runs. A smaller value is better [65]. It is formulated as:
    $$STD = \sqrt{\frac{1}{N_r} \sum_{k=1}^{N_r} \left(Acc_{Best_k} - Avg_{acc}\right)^2}$$
  • The average number of selected features ($AVG_{|BX_{Best}|}$) is applied to test an algorithm's ability to choose the smallest subset of relevant features over all runs. A smaller value is better [65]. It is given as:
    $$AVG_{|BX_{Best}|} = \frac{1}{N_r} \sum_{k=1}^{N_r} |BX_{Best_k}|$$
    where $|.|$ denotes the cardinality of $BX_{Best_k}$ at the $k$-th run.
  • The average fitness value ($AVG_{Fit}$) evaluates the algorithm's ability to balance the lower error and the ratio of selected features. A smaller value is better [65]. It is formulated as:
    $$AVG_{Fit} = \frac{1}{N_r} \sum_{k=1}^{N_r} Fit_{Best_k}$$
    where $TP$ and $TN$ refer to the true positives and true negatives, whereas $FN$ and $FP$ are the false negatives and false positives, respectively [66].
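These run-level statistics can be computed directly from the per-run results, as in the following sketch; the array names are illustrative.

import numpy as np

def summarize_runs(best_acc, best_fit, n_selected):
    # best_acc, best_fit, n_selected: one value per independent run.
    return {
        "avg_accuracy": np.mean(best_acc),
        "std_accuracy": np.std(best_acc),          # population STD over runs
        "avg_selected_features": np.mean(n_selected),
        "avg_fitness": np.mean(best_fit),
    }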

4.2. Experimental Series 1: UCI Datasets Results and Discussion

Within this experiment, a set of UCI datasets is used. These datasets are collected from different fields such as Biology, Game, Electromagnetic, Politics, Physics, and Chemistry. In addition, each of them has a different number of samples, features, and classes. The description of each dataset is given in Table 1.

4.2.1. Results and Discussion of UCI Dataset

This subsection presents and discusses the experimental results and comparisons obtained in solving the feature selection problem. The comparisons use the standard AOA, SMA, HHO, GA, MVO, SSA, MFO, GOA, PSO, and GWO with the metrics described in the previous sections. Table 2 presents the experimental results using the mean fitness value of all the compared methods over the 20 datasets. From this table, it is possible to see that the proposed AOAGA is superior in 16 of the 20 experiments, whereas PSO obtains the best results in 4 cases and GWO in only 1. These results show that the AOAGA is superior and accurate regarding the fitness value for feature selection. In the tables, boldface indicates the best value.
Continuing with the fitness value, it is possible to analyze the minimum (MIN) and maximum (MAX) values of the fitness function. This analysis allows us to know when the algorithms obtain the best and worst values. Table 3 shows the MIN fitness values obtained by the selected algorithms on all the datasets. The AOAGA, PSO, and GWO have the lowest values in most cases (13 of the 20 datasets), which occurs because these algorithms can also reach the optimal values.
On the other hand, Table 4 shows the MAX fitness values obtained after the experiments of the selected algorithms over the 20 UCI datasets. The AOAGA is the method that provides the best MAX value in all the cases, and the PSO only in two cases. The rest of the algorithms do not obtain the best maximal value in any of the experiments.
The stability of the results computed by the algorithms is analyzed using the standard deviation (STD). The STD is calculated over all the independent experiments performed for each dataset, using the fitness value as input. In this case, the number of experimental runs is set to 30. A lower STD represents better stability of the results; in other words, there are no substantial changes across the experiments. Table 5 presents the values of the STD, where the AOAGA has the lowest value in 13 out of 20 datasets, the HHO and the PSO in only 3, the AOA in 2, the MVO in 1, and the rest of the methods did not achieve the best results on any dataset.
On the other hand, as previously explained, the accuracy evaluates the quality of the classification based on the true positive, true negative, false positive, and false negative values. Here, values close to 1 are desired, as they represent a higher accuracy. Table 6 presents the values for the selected algorithms; the proposed AOAGA is the method with the best classification accuracy. The AOAGA obtains the value closest to 1 in 17 of the 20 datasets, the PSO in 3, the MVO and the GWO in 1, and the rest in none. Moreover, Figure 3 and Figure 4 illustrate the good performance of the proposed method in terms of the average fitness values and the accuracy measure.
In the feature selection problem, it is necessary to identify a set that contains a reduced number of the most representative features. The number of selected features can therefore also be considered a metric to verify the performance of the experiments over the selected datasets. The number of selected features for each algorithm is presented in Table 7. From this table, the AOAGA is the algorithm that has the lowest value 16 times, the SMA and the GOA 2 times, and the AOA and HHO only 1 time. The values of the rest of the algorithms are higher.
The computational time of each algorithm is analyzed in Table 8. In this case, a reduced value is expected. However, this does not by itself represent good performance, because a fast algorithm is not necessarily accurate. This can be seen in Table 8, where the SMA is the algorithm with the lowest time 13 times, followed by the MFO with 3 times, and MVO, SSA, PSO, and GWO with only 1 time each. In this case, the proposed AOAGA is the one with the highest computational time due to the hybridization of the operators; however, its performance is better than that of the rest of the algorithms in the comparisons.
Figure 5 and Figure 6 depict the average fitness values and their boxplots, respectively. From these curves, it can be seen that the AOAGA has a better convergence rate than the other methods. This can be observed from the second half of the iterations in most datasets. In addition, it can be noticed from the boxplots that the developed AOAGA has the lowest box, and that SMA is the worst MH technique according to the results obtained in this study.
Table 9 shows the mean ranks obtained using the non-parametric Friedman test. The main objective of this test is to determine whether there is a significant difference between the developed AOAGA and the other methods. From these results, it can be seen that the developed AOAGA has the smallest mean rank on thirteen datasets, which represents nearly 65% of all datasets, followed by PSO, which has the best mean rank on six datasets (nearly 30%). These results indicate the ability of the developed method to converge faster than the other methods.

4.2.2. Comparison with State-of-the-Art Methods

In this section, the developed AOAGA is compared with other state-of-the-art FS methods namely SMAFA [9], BSSAS3 [67], bGWO2 [68], SbBOA [69], BGOAM [70], Das [71], and S-bBOA [69].
The comparison results between the developed method and the other FS methods are given in Table 10. From this table, it can be noticed that the developed AOAGA shows good performance, obtaining the highest accuracy on sixteen datasets, whereas the SMAFA has the best accuracy on seven datasets and BGOAM is the best on two datasets.

4.3. Experimental Series 2: Real Application of AOAGA

Survival data with censoring appear frequently in real applications, such as biology and epidemiology [72,73]. Nowadays, gene expression data are increasingly applied to different clinical outcomes in order to facilitate disease diagnosis. Such data are high-dimensional, with the number of genes exceeding the number of observations [74]. Regression is a standard technique to jointly study the influence of multiple predictors on a response. The Cox regression technique is one of the most standard regression techniques for survival data with censoring. When the dimensionality of the predictors is large, the traditional method of estimating the Cox regression model is undesirable, since its prediction accuracy is low and it is hard to interpret [75]. To tackle this issue, feature selection has become an important focus in the Cox regression technique.
To examine the performance of the proposed hybrid algorithm, AOAGA, two real gene datasets were used. The first dataset is the diffuse large B-cell lymphoma dataset (DLBC2002) [54]. DLBC2002 contains samples from 240 lymphoma patients, each with 7399 gene expression measurements. The second dataset is the lung cancer dataset (Lung-cancer) [55]. This dataset contains information on 86 lung cancer patients, for each of whom 7129 gene expressions were measured. For both datasets, the response variable is the survival time together with a censoring indicator.
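For reference, a hedged sketch of how a candidate gene subset could be scored by the Cox proportional hazards log-likelihood is given below; it assumes the lifelines library and a pandas data frame with hypothetical column names for the survival time and censoring indicator, not the exact pipeline used in this study.

import pandas as pd
from lifelines import CoxPHFitter

def cox_log_likelihood(df, selected_genes, time_col="time", event_col="status"):
    # Fit a Cox PH model on the selected genes only; the penalizer is an
    # assumption to keep the fit stable when many genes are correlated.
    data = df[list(selected_genes) + [time_col, event_col]]
    cph = CoxPHFitter(penalizer=0.1)
    cph.fit(data, duration_col=time_col, event_col=event_col)
    return cph.log_likelihood_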

Results and Discussion of Real Gene Datasets

In order to show the performance achieved by our AOAGA and the other algorithms on the two datasets, the average, MIN, and MAX values of the log-likelihood, used as the fitness function, are given in Table 11, Table 12 and Table 13, respectively. From Table 11, the proposed algorithm, AOAGA, achieved a better performance on the two datasets. Moreover, it is clear from the results that the AOAGA is more successful than the AOA on all datasets. This enhancement is mainly because of the developed algorithm's ability to take into account the limitations of the standard AOA algorithm. In terms of the standard deviation criterion in Table 14, the AOAGA attained the lowest standard deviation value in both the DLBC2002 and Lung-cancer datasets and is considered the most stable one among the compared algorithms.
Table 15 summarizes the average results of the different algorithms applied. According to Table 15, the number of genes selected by MFO is larger than those of all the other algorithms. Among the other algorithms, the proposed algorithm, AOAGA, selected fewer genes. In the Lung-cancer dataset, for example, the AOAGA selected the lowest ratio of genes. In the DLBC2002 dataset, the AOAGA showed comparable results to the SMA algorithm.

5. Conclusions

This paper proposed a novel feature selection method based on an improved Arithmetic Optimization Algorithm (AOA) to generate a new subset of best features. The main idea of the proposed method, called AOAGA, is to apply the operators of the genetic algorithm (GA) to boost the performance of the traditional AOA. The traditional AOA suffers from weaknesses in its local search strategy and in the trade-off between its search strategies. Thus, the proposed method uses a new transition mechanism to switch between the AOA and the GA operators, which guarantees that the solutions' diversity is preserved. We evaluated the AOAGA on twenty well-known benchmark datasets to verify its effectiveness in solving different feature selection problems. Moreover, several standard evaluation criteria were used to evaluate the results of the AOAGA, including the accuracy, the number of selected features, and the fitness function. To further assess the performance of the proposed AOAGA method, two real-world problems containing gene datasets were used. Finally, the results were compared with several well-known state-of-the-art techniques to prove the performance of the proposed AOAGA method. The results illustrated that the proposed AOAGA method finds new best solutions for different test cases, and it achieved promising results compared to other methods published in the literature. However, there are certain limitations that must be addressed, such as the computational time of the proposed method in the case of high-dimensional datasets.
For future work, the proposed AOAGA can be investigated further in order to adapt its operator accurately and get further improvement. It can also be modified differently to adjust its search operators. Moreover, the proposed AOAGA can be tested to solve other benchmark optimization and real-world problems such as clustering, image segmentation, task scheduling in fog computing, medical data classification, sentiment analysis, parameter estimation, and others.

Author Contributions

Conceptualization, A.A.E., M.A.A.A.-q., L.A., D.O., Z.Y.A., R.A.I. and M.A.E.; Data curation, Z.Y.A. and R.A.I.; Formal analysis, A.A.E., M.A.A.A.-q., L.A., D.O. and M.A.E.; Funding acquisition, M.A.E.; Investigation, M.A.A.A.-q., L.A., D.O., Z.Y.A., A.M.A. and R.M.G.; Methodology, A.A.E., L.A., R.A.I. and M.A.E.; Resources, R.M.G.; Validation, A.A.E., L.A., D.O., Z.Y.A., A.M.A., R.A.I., R.M.G. and M.A.E.; Visualization, A.A.E., A.M.A., R.M.G. and M.A.E.; Writing—original draft, A.A.E., M.A.A.A.-q., D.O., Z.Y.A. and M.A.E.; Writing—review & editing, A.A.E., M.A.A.A.-q., A.M.A., R.A.I., R.M.G. and M.A.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.

Data Availability Statement

The data are available on: https://archive.ics.uci.edu/ml/, accessed on 22 August 2021.

Acknowledgments

The authors acknowledge the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ghamisi, P.; Benediktsson, J.A. Feature selection based on hybridization of genetic algorithm and particle swarm optimization. IEEE Geosci. Remote Sens. Lett. 2014, 12, 309–313. [Google Scholar] [CrossRef] [Green Version]
  2. Garg, H. A hybrid PSO-GA algorithm for constrained optimization problems. Appl. Math. Comput. 2016, 274, 292–305. [Google Scholar] [CrossRef]
  3. Shao, Z.; Wu, W.; Li, D. Spatio-temporal-spectral observation model for urban remote sensing. Geo-Spat. Inf. Sci. 2021, 1–15. [Google Scholar] [CrossRef]
  4. Ibrahim, A.M.; Tawhid, M.A.; Ward, R.K. A binary water wave optimization for feature selection. Int. J. Approx. Reason. 2020, 120, 74–91. [Google Scholar] [CrossRef]
  5. Şahin, C.B.; Abualigah, L. A novel deep learning-based feature selection model for improving the static analysis of vulnerability detection. Neural Comput. Appl. 2021, 1–19. [Google Scholar] [CrossRef]
  6. Al-qaness, M.A. Device-free human micro-activity recognition method using WiFi signals. Geo-Spat. Inf. Sci. 2019, 22, 128–137. [Google Scholar] [CrossRef]
  7. Abd Elaziz, M.; Dahou, A.; Abualigah, L.; Yu, L.; Alshinwan, M.; Khasawneh, A.M.; Lu, S. Advanced metaheuristic optimization techniques in applications of deep neural networks: A review. Neural Comput. Appl. 2021, 1–21. [Google Scholar] [CrossRef]
  8. Shao, Z.; Sumari, N.S.; Portnov, A.; Ujoh, F.; Musakwa, W.; Mandela, P.J. Urban sprawl and its impact on sustainable urban development: A combination of remote sensing and social media data. Geo-Spat. Inf. Sci. 2021, 24, 241–255. [Google Scholar] [CrossRef]
  9. Ewees, A.A.; Abualigah, L.; Yousri, D.; Algamal, Z.Y.; Al-qaness, M.A.; Ibrahim, R.A.; Abd Elaziz, M. Improved Slime Mould Algorithm based on Firefly Algorithm for feature selection: A case study on QSAR model. Eng. Comput. 2021, 1–15. [Google Scholar] [CrossRef]
  10. Ayesha, S.; Hanif, M.K.; Talib, R. Overview and comparative study of dimensionality reduction techniques for high dimensional data. Inf. Fusion 2020, 59, 44–58. [Google Scholar] [CrossRef]
  11. Molina, L.C.; Belanche, L.; Nebot, À. Feature selection algorithms: A survey and experimental evaluation. In Proceedings of the 2002 IEEE International Conference on Data Mining, Maebashi City, Japan, 9–12 December 2002; pp. 306–313. [Google Scholar]
  12. Şahin, C.B.; Dinler, Ö.B.; Abualigah, L. Prediction of software vulnerability based deep symbiotic genetic algorithms: Phenotyping of dominant-features. Appl. Intell. 2021, 1–17. [Google Scholar] [CrossRef]
  13. Al-Tashi, Q.; Kadir, S.J.A.; Rais, H.M.; Mirjalili, S.; Alhussian, H. Binary optimization using hybrid grey wolf optimization for feature selection. IEEE Access 2019, 7, 39496–39508. [Google Scholar] [CrossRef]
  14. Zhang, X.; Xu, Y.; Yu, C.; Heidari, A.A.; Li, S.; Chen, H.; Li, C. Gaussian mutational chaotic fruit fly-built optimization and feature selection. Expert Syst. Appl. 2020, 141, 112976. [Google Scholar] [CrossRef]
  15. Abd Elaziz, M.; Ewees, A.A.; Ibrahim, R.A.; Lu, S. Opposition-based moth-flame optimization improved by differential evolution for feature selection. Math. Comput. Simul. 2020, 168, 48–75. [Google Scholar] [CrossRef]
  16. Alshaer, H.N.; Otair, M.A.; Abualigah, L.; Alshinwan, M.; Khasawneh, A.M. Feature selection method using improved CHI Square on Arabic text classifiers: Analysis and application. Multimed. Tools Appl. 2021, 80, 10373–10390. [Google Scholar] [CrossRef]
  17. Dela Torre, D.M.G.; Gao, J.; Macinnis-Ng, C. Remote sensing-based estimation of rice yields using various models: A critical review. Geo-Spat. Inf. Sci. 2021, 1–24. [Google Scholar] [CrossRef]
  18. Yang, J.; Honavar, V. Feature subset selection using a genetic algorithm. In Feature Extraction, Construction and Selection; Springer: Berlin/Heidelberg, Germany, 1998; pp. 117–136. [Google Scholar]
  19. Wang, X.; Yang, J.; Teng, X.; Xia, W.; Jensen, R. Feature selection based on rough sets and particle swarm optimization. Pattern Recognit. Lett. 2007, 28, 459–471. [Google Scholar] [CrossRef] [Green Version]
  20. Garg, H. A hybrid GA-GSA algorithm for optimizing the performance of an industrial system by utilizing uncertain data. In Handbook of Research on Artificial Intelligence Techniques and Algorithms; IGI Global: Hershey, PA, USA, 2015; pp. 620–654. [Google Scholar]
  21. Hu, Y.; Zhang, Y.; Gong, D. Multiobjective particle swarm optimization for feature selection with fuzzy cost. IEEE Trans. Cybern. 2020, 51, 874–888. [Google Scholar] [CrossRef]
  22. Garg, H. A hybrid GSA-GA algorithm for constrained optimization problems. Inf. Sci. 2019, 478, 499–523. [Google Scholar] [CrossRef]
  23. Alweshah, M.; Al Khalaileh, S.; Gupta, B.B.; Almomani, A.; Hammouri, A.I.; Al-Betar, M.A. The monarch butterfly optimization algorithm for solving feature selection problems. Neural Comput. Appl. 2020, 1–15. [Google Scholar] [CrossRef]
  24. Agrawal, P.; Ganesh, T.; Mohamed, A.W. A novel binary gaining–sharing knowledge-based optimization algorithm for feature selection. Neural Comput. Appl. 2021, 33, 5989–6008. [Google Scholar] [CrossRef]
  25. Sharma, M.; Kaur, P. A Comprehensive Analysis of Nature-Inspired Meta-Heuristic Techniques for Feature Selection Problem. Arch. Comput. Methods Eng. 2021, 28, 1103–1127. [Google Scholar] [CrossRef]
  26. Ghosh, K.K.; Guha, R.; Bera, S.K.; Kumar, N.; Sarkar, R. S-shaped versus V-shaped transfer functions for binary Manta ray foraging optimization in feature selection problem. Neural Comput. Appl. 2021, 33, 11027–11041. [Google Scholar] [CrossRef]
  27. Hassan, M.H.; Kamel, S.; Abualigah, L.; Eid, A. Development and application of slime mould algorithm for optimal economic emission dispatch. Expert Syst. Appl. 2021, 182, 115205. [Google Scholar] [CrossRef]
  28. Wang, S.; Jia, H.; Abualigah, L.; Liu, Q.; Zheng, R. An Improved Hybrid Aquila Optimizer and Harris Hawks Algorithm for Solving Industrial Engineering Optimization Problems. Processes 2021, 9, 1551. [Google Scholar] [CrossRef]
  29. Altabeeb, A.M.; Mohsen, A.M.; Abualigah, L.; Ghallab, A. Solving capacitated vehicle routing problem using cooperative firefly algorithm. Appl. Soft Comput. 2021, 108, 107403. [Google Scholar] [CrossRef]
  30. Gul, F.; Mir, I.; Abualigah, L.; Sumari, P. Multi-Robot Space Exploration: An Augmented Arithmetic Approach. IEEE Access 2021, 9, 107738–107750. [Google Scholar] [CrossRef]
  31. Abdel-Basset, M.; El-Shahat, D.; El-henawy, I.; de Albuquerque, V.H.C.; Mirjalili, S. A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection. Expert Syst. Appl. 2020, 139, 112824. [Google Scholar] [CrossRef]
  32. Al-qaness, M.A.; Ewees, A.A.; Abd Elaziz, M. Modified whale optimization algorithm for solving unrelated parallel machine scheduling problems. Soft Comput. 2021, 25, 9545–9557. [Google Scholar] [CrossRef]
  33. de Souza, R.C.T.; de Macedo, C.A.; dos Santos Coelho, L.; Pierezan, J.; Mariani, V.C. Binary coyote optimization algorithm for feature selection. Pattern Recognit. 2020, 107, 107470. [Google Scholar] [CrossRef]
  34. Abualigah, L.M.; Khader, A.T.; Al-Betar, M.A.; Alomari, O.A. Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering. Expert Syst. Appl. 2017, 84, 24–36. [Google Scholar] [CrossRef]
  35. Abualigah, L.M.Q. Feature Selection and Enhanced Krill Herd Algorithm for Text Document Clustering; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  36. Abualigah, L.M.; Khader, A.T.; Hanandeh, E.S. A new feature selection method to improve the document clustering using particle swarm optimization algorithm. J. Comput. Sci. 2018, 25, 456–466. [Google Scholar] [CrossRef]
  37. Abualigah, L.; Yousri, D.; Abd Elaziz, M.; Ewees, A.A.; Al-qaness, M.; Gandomi, A.H. Aquila Optimizer: A novel meta-heuristic optimization Algorithm. Comput. Ind. Eng. 2021, 157, 107250. [Google Scholar] [CrossRef]
  38. Abualigah, L.M.; Khader, A.T. Unsupervised text feature selection technique based on hybrid particle swarm optimization algorithm with genetic operators for the text clustering. J. Supercomput. 2017, 73, 4773–4795. [Google Scholar] [CrossRef]
  39. Abualigah, L.; Alsalibi, B.; Shehab, M.; Alshinwan, M.; Khasawneh, A.M.; Alabool, H. A parallel hybrid krill herd algorithm for feature selection. Int. J. Mach. Learn. Cybern. 2020, 12, 783–806. [Google Scholar] [CrossRef]
  40. Xue, B.; Zhang, M.; Browne, W.N. Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Trans. Cybern. 2012, 43, 1656–1671. [Google Scholar] [CrossRef] [PubMed]
  41. Aghdam, M.H.; Ghasem-Aghaee, N.; Basiri, M.E. Text feature selection using ant colony optimization. Expert Syst. Appl. 2009, 36, 6843–6853. [Google Scholar] [CrossRef]
  42. Al-Qaness, M.A.; Fan, H.; Ewees, A.A.; Yousri, D.; Abd Elaziz, M. Improved ANFIS model for forecasting Wuhan City air quality and analysis COVID-19 lockdown impacts on air quality. Environ. Res. 2021, 194, 110607. [Google Scholar] [CrossRef]
  43. Ibrahim, R.A.; Ewees, A.A.; Oliva, D.; Abd Elaziz, M.; Lu, S. Improved salp swarm algorithm based on particle swarm optimization for feature selection. J. Ambient. Intell. Humaniz. Comput. 2019, 10, 3155–3169. [Google Scholar] [CrossRef]
  44. Patwal, R.S.; Narang, N.; Garg, H. A novel TVAC-PSO based mutation strategies algorithm for generation scheduling of pumped storage hydrothermal system incorporating solar units. Energy 2018, 142, 822–837. [Google Scholar] [CrossRef]
  45. El-Kenawy, E.S.; Eid, M. Hybrid gray wolf and particle swarm optimization for feature selection. Int. J. Innov. Comput. Inf. Control 2020, 16, 831–844. [Google Scholar]
  46. Anter, A.M.; Ali, M. Feature selection strategy based on hybrid crow search optimization algorithm integrated with chaos theory and fuzzy c-means algorithm for medical diagnosis problems. Soft Comput. 2020, 24, 1565–1584. [Google Scholar] [CrossRef]
  47. Tubishat, M.; Ja’afar, S.; Alswaitti, M.; Mirjalili, S.; Idris, N.; Ismail, M.A.; Omar, M.S. Dynamic salp swarm algorithm for feature selection. Expert Syst. Appl. 2021, 164, 113873. [Google Scholar] [CrossRef]
  48. Abualigah, L.; Diabat, A.; Mirjalili, S.; Abd Elaziz, M.; Gandomi, A.H. The arithmetic optimization algorithm. Comput. Methods Appl. Mech. Eng. 2021, 376, 113609. [Google Scholar] [CrossRef]
  49. Goldberg, D.E.; Holland, J.H. Genetic algorithms and machine learning. Mach. Learn. 1988, 3, 95–99. [Google Scholar] [CrossRef]
  50. Eshelman, L.J.; Schaffer, J.D. Real-coded genetic algorithms and interval-schemata. In Foundations of Genetic Algorithms; Elsevier: Amsterdam, The Netherlands, 1993; Volume 2, pp. 187–202. [Google Scholar]
  51. Higashi, N.; Iba, H. Particle swarm optimization with Gaussian mutation. In Proceedings of the 2003 IEEE Swarm Intelligence Symposium. SIS’03 (Cat. No.03EX706), Indianapolis, IN, USA, 26 April 2003; pp. 72–79. [Google Scholar]
  52. Lipowski, A.; Lipowska, D. Roulette-wheel selection via stochastic acceptance. Phys. A Stat. Mech. Its Appl. 2012, 391, 2193–2196. [Google Scholar] [CrossRef] [Green Version]
  53. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: https://archive.ics.uci.edu/ml/ (accessed on 22 August 2021).
  54. Rosenwald, A.; Wright, G.; Chan, W.C.; Connors, J.M.; Campo, E.; Fisher, R.I.; Gascoyne, R.D.; Muller-Hermelink, H.K.; Smeland, E.B.; Giltnane, J.M.; et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 2002, 346, 1937–1947. [Google Scholar] [CrossRef]
  55. Beer, D.G.; Kardia, S.L.; Huang, C.C.; Giordano, T.J.; Levin, A.M.; Misek, D.E.; Lin, L.; Chen, G.; Gharib, T.G.; Thomas, D.G.; et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 2002, 8, 816–824. [Google Scholar] [CrossRef]
  56. Li, S.; Chen, H.; Wang, M.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323. [Google Scholar] [CrossRef]
  57. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872. [Google Scholar] [CrossRef]
  58. Forrest, S. Genetic algorithms. ACM Comput. Surv. (CSUR) 1996, 28, 77–80. [Google Scholar] [CrossRef]
  59. Ewees, A.A.; Abd El Aziz, M.; Hassanien, A.E. Chaotic multi-verse optimizer-based feature selection. Neural Comput. Appl. 2019, 31, 991–1006. [Google Scholar] [CrossRef]
  60. Mirjalili, S.; Gandomi, A.H.; Mirjalili, S.Z.; Saremi, S.; Faris, H.; Mirjalili, S.M. Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems. Adv. Eng. Softw. 2017, 114, 163–191. [Google Scholar] [CrossRef]
  61. Ewees, A.A.; Abd Elaziz, M.; Houssein, E.H. Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst. Appl. 2018, 112, 156–172. [Google Scholar] [CrossRef]
  62. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef] [Green Version]
  63. Gu, S.; Cheng, R.; Jin, Y. Feature selection for high-dimensional classification using a competitive swarm optimizer. Soft Comput. 2018, 22, 811–822. [Google Scholar] [CrossRef] [Green Version]
  64. Hu, P.; Pan, J.S.; Chu, S.C. Improved binary grey wolf optimizer and its application for feature selection. Knowl.-Based Syst. 2020, 195, 105746. [Google Scholar] [CrossRef]
  65. Neggaz, N.; Ewees, A.A.; Abd Elaziz, M.; Mafarja, M. Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Expert Syst. Appl. 2020, 145, 113103. [Google Scholar] [CrossRef]
  66. Dhiman, G.; Oliva, D.; Kaur, A.; Singh, K.K.; Vimal, S.; Sharma, A.; Cengiz, K. BEPO: A novel binary emperor penguin optimizer for automatic feature selection. Knowl.-Based Syst. 2021, 211, 106560. [Google Scholar] [CrossRef]
  67. Faris, H.; Mafarja, M.M.; Heidari, A.A.; Aljarah, I.; Ala’M, A.Z.; Mirjalili, S.; Fujita, H. An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowl.-Based Syst. 2018, 154, 43–67. [Google Scholar] [CrossRef]
  68. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary grey wolf optimization approaches for feature selection. Neurocomputing 2016, 172, 371–381. [Google Scholar] [CrossRef]
  69. Arora, S.; Anand, P. Binary butterfly optimization approaches for feature selection. Expert Syst. Appl. 2019, 116, 147–160. [Google Scholar] [CrossRef]
  70. Mafarja, M.; Aljarah, I.; Faris, H.; Hammouri, A.I.; Ala’M, A.Z.; Mirjalili, S. Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Syst. Appl. 2019, 117, 267–286. [Google Scholar] [CrossRef]
  71. Das, A.; Das, S. Feature weighting and selection with a Pareto-optimal trade-off between relevancy and redundancy. Pattern Recognit. Lett. 2017, 88, 12–19. [Google Scholar] [CrossRef]
  72. Cockeran, M.; Meintanis, S.G.; Allison, J.S. Goodness-of-fit tests in the Cox proportional hazards model. Commun. Stat.-Simul. Comput. [CrossRef]
  73. Emura, T.; Chen, Y.H.; Chen, H.Y. Survival prediction based on compound covariate under cox proportional hazard models. PLoS ONE 2012, 7, e47627. [Google Scholar]
  74. Jiang, H.K.; Liang, Y. The L1/2 regularization network Cox model for analysis of genomic data. Comput. Biol. Med. 2018, 100, 203–208. [Google Scholar] [CrossRef] [PubMed]
  75. Leng, C.; Helen Zhang, H. Model selection in nonparametric hazard regression. Nonparametr. Stat. 2006, 18, 417–429. [Google Scholar] [CrossRef]
Figure 1. Single point crossover.
Figure 2. Flowchart of the proposed method.
Figure 3. Average of the fitness values for all algorithms.
Figure 4. Average of the accuracy measure values for all algorithms.
Figure 5. Examples of the convergence curves for the compared methods.
Figure 6. Examples of the boxplot for the compared methods.
Table 1. Description of the benchmark datasets.
Name | Number of Features | Number of Instances | Number of Classes | Data Category
breastWDBC | 30 | 569 | 2 | Biology
ionosphere | 34 | 351 | 2 | Physical
wine | 13 | 178 | 3 | Chemistry
breastcancer | 9 | 699 | 2 | Biology
sonar | 60 | 208 | 2 | Biology
glass | 9 | 214 | 7 | Physics
tic-tac-toe | 9 | 958 | 2 | Game
Lymphography | 18 | 148 | 2 | Biology
waveform | 40 | 5000 | 3 | Physics
clean1data | 166 | 476 | 2 | Artificial
Zoo | 16 | 101 | 6 | Artificial
SPECT | 22 | 267 | 2 | Biology
ecoli | 7 | 336 | 8 | Biology
CongressEW | 16 | 435 | 2 | Politics
M-of-n | 13 | 1000 | 2 | Biology
Exactly | 13 | 1000 | 2 | Biology
Exactly2 | 13 | 1000 | 2 | Biology
Vote | 16 | 300 | 2 | Politics
heart | 13 | 270 | 2 | Biology
krvskp | 36 | 3196 | 2 | Game
Table 2. Results of the Fitness values for all methods.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 0.0968 | 0.1107 | 0.3503 | 0.1207 | 0.1261 | 0.1329 | 0.1122 | 0.1754 | 0.2134 | 0.1058 | 0.1080
ionosphere | 0.1567 | 0.2086 | 0.3920 | 0.2049 | 0.2262 | 0.2167 | 0.2213 | 0.2704 | 0.3201 | 0.1828 | 0.1803
wine | 0.0000 | 0.0480 | 0.1858 | 0.0067 | 0.0151 | 0.0194 | 0.0610 | 0.1252 | 0.1578 | 0.0043 | 0.0117
breastcancer | 0.1620 | 0.2216 | 0.3947 | 0.1898 | 0.2159 | 0.2140 | 0.2082 | 0.2580 | 0.3293 | 0.1682 | 0.1640
glass | 0.1406 | 0.1452 | 0.2183 | 0.1434 | 0.1520 | 0.1492 | 0.1499 | 0.1879 | 0.2203 | 0.1419 | 0.1451
sonar | 0.1387 | 0.2232 | 0.4116 | 0.2065 | 0.2083 | 0.1769 | 0.1885 | 0.2703 | 0.3332 | 0.1204 | 0.1423
Lymphography | 0.2547 | 0.3261 | 0.5342 | 0.3054 | 0.3561 | 0.3245 | 0.3188 | 0.4523 | 0.5144 | 0.2818 | 0.3221
tic-tac-toe | 0.0000 | 0.2079 | 0.5394 | 0.0022 | 0.1560 | 0.1370 | 0.0255 | 0.4405 | 0.5227 | 0.0018 | 0.1513
waveform | 0.6323 | 0.6646 | 0.9059 | 0.6561 | 0.6512 | 0.6551 | 0.6489 | 0.6719 | 0.7307 | 0.6349 | 0.6435
clean1data | 0.2305 | 0.2472 | 0.4384 | 0.2633 | 0.2589 | 0.2465 | 0.2680 | 0.2692 | 0.3470 | 0.2240 | 0.2092
SPECT | 0.2982 | 0.3640 | 0.4780 | 0.3442 | 0.3633 | 0.3572 | 0.3535 | 0.4028 | 0.4728 | 0.3287 | 0.3418
Zoo | 0.0000 | 0.0154 | 0.2063 | 0.0029 | 0.0145 | 0.0137 | 0.0443 | 0.0966 | 0.1260 | 0.0038 | 0.0083
ecoli | 0.1945 | 0.2178 | 0.3464 | 0.2208 | 0.2271 | 0.2259 | 0.2263 | 0.2712 | 0.3355 | 0.2202 | 0.2212
CongressEW | 0.1090 | 0.1478 | 0.4035 | 0.1645 | 0.1842 | 0.1707 | 0.1812 | 0.2308 | 0.3025 | 0.1363 | 0.1565
Exactly | 0.0000 | 0.2092 | 0.5858 | 0.0539 | 0.1897 | 0.1659 | 0.0576 | 0.4333 | 0.5944 | 0.0000 | 0.0869
Exactly2 | 0.4699 | 0.5030 | 0.5699 | 0.4929 | 0.5048 | 0.5060 | 0.5081 | 0.5447 | 0.5816 | 0.4884 | 0.4970
M-of-n | 0.0000 | 0.1867 | 0.4790 | 0.0383 | 0.1419 | 0.0645 | 0.0388 | 0.3096 | 0.4955 | 0.0000 | 0.0381
Vote | 0.0929 | 0.1871 | 0.4115 | 0.1727 | 0.2015 | 0.1906 | 0.1871 | 0.2595 | 0.3431 | 0.1626 | 0.1719
krvskp | 0.1541 | 0.1857 | 0.5281 | 0.1752 | 0.1954 | 0.1639 | 0.1718 | 0.1578 | 0.3534 | 0.1192 | 0.1628
heart | 0.3366 | 0.3884 | 0.5425 | 0.3575 | 0.3794 | 0.3998 | 0.3617 | 0.4255 | 0.4969 | 0.3471 | 0.3941
Table 3. Results of the MIN measure for all methods.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 0.0839 | 0.0839 | 0.0839 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.1187 | 0.1187 | 0.0000 | 0.0000
ionosphere | 0.0000 | 0.0000 | 0.2132 | 0.1066 | 0.1066 | 0.1066 | 0.1066 | 0.1508 | 0.1846 | 0.1066 | 0.0000
wine | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
breastcancer | 0.1508 | 0.1066 | 0.1846 | 0.0000 | 0.1066 | 0.1066 | 0.1066 | 0.1066 | 0.2132 | 0.1066 | 0.0000
glass | 0.0869 | 0.0869 | 0.0991 | 0.0869 | 0.0869 | 0.0952 | 0.0869 | 0.1166 | 0.1289 | 0.0869 | 0.0869
sonar | 0.0000 | 0.0000 | 0.2774 | 0.0000 | 0.1387 | 0.0000 | 0.0000 | 0.1387 | 0.1961 | 0.0000 | 0.0000
Lymphography | 0.1644 | 0.1644 | 0.3676 | 0.1644 | 0.2325 | 0.2325 | 0.2325 | 0.2325 | 0.2847 | 0.1644 | 0.2325
tic-tac-toe | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.2587 | 0.0000 | 0.0000 | 0.0000
waveform | 0.6145 | 0.6267 | 0.6917 | 0.6255 | 0.6112 | 0.6119 | 0.6007 | 0.5973 | 0.6306 | 0.6112 | 0.6138
clean1data | 0.2245 | 0.1588 | 0.2750 | 0.1588 | 0.1833 | 0.1296 | 0.1833 | 0.1588 | 0.2593 | 0.1588 | 0.1296
SPECT | 0.2732 | 0.2732 | 0.2732 | 0.1728 | 0.2443 | 0.2443 | 0.2732 | 0.2732 | 0.3665 | 0.2116 | 0.2116
Zoo | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
ecoli | 0.1509 | 0.1519 | 0.2032 | 0.1750 | 0.1750 | 0.1750 | 0.1750 | 0.1903 | 0.2056 | 0.1750 | 0.1750
CongressEW | 0.0000 | 0.0000 | 0.1916 | 0.0958 | 0.0958 | 0.0958 | 0.0958 | 0.0958 | 0.0958 | 0.0000 | 0.0000
Exactly | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
Exactly2 | 0.4427 | 0.4517 | 0.4648 | 0.4427 | 0.4427 | 0.4427 | 0.4427 | 0.4648 | 0.5292 | 0.4427 | 0.4427
M-of-n | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
Vote | 0.0000 | 0.1155 | 0.1633 | 0.1155 | 0.0000 | 0.1155 | 0.0000 | 0.1155 | 0.1633 | 0.0000 | 0.0000
krvskp | 0.1501 | 0.1061 | 0.2209 | 0.1370 | 0.1415 | 0.1226 | 0.1324 | 0.1001 | 0.2032 | 0.0791 | 0.1061
heart | 0.3232 | 0.2993 | 0.3665 | 0.2993 | 0.3232 | 0.3455 | 0.2732 | 0.2993 | 0.3455 | 0.2732 | 0.3232
Table 4. Results of the MAX measure for all methods.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 0.1187 | 0.1187 | 0.6766 | 0.1876 | 0.2056 | 0.1876 | 0.2374 | 0.2654 | 0.3140 | 0.1876 | 0.1876
ionosphere | 0.2132 | 0.3198 | 0.5539 | 0.2820 | 0.3015 | 0.3015 | 0.3371 | 0.4129 | 0.4767 | 0.2611 | 0.2611
wine | 0.0000 | 0.0754 | 0.4264 | 0.0754 | 0.0754 | 0.0754 | 0.2820 | 0.2611 | 0.3454 | 0.0754 | 0.1066
breastcancer | 0.1846 | 0.3198 | 0.5741 | 0.2611 | 0.3371 | 0.3015 | 0.3198 | 0.3989 | 0.4523 | 0.2611 | 0.2820
glass | 0.1822 | 0.2146 | 0.2894 | 0.1903 | 0.1903 | 0.1981 | 0.2664 | 0.2855 | 0.3442 | 0.1903 | 0.1903
sonar | 0.1387 | 0.3397 | 0.5371 | 0.3397 | 0.3922 | 0.3669 | 0.3669 | 0.4804 | 0.4599 | 0.3397 | 0.2774
Lymphography | 0.2847 | 0.4932 | 0.6778 | 0.4027 | 0.4650 | 0.4350 | 0.6367 | 0.6778 | 0.7352 | 0.4027 | 0.4350
tic-tac-toe | 0.0000 | 0.3935 | 0.7516 | 0.0647 | 0.4574 | 0.4884 | 0.5786 | 0.6501 | 0.7318 | 0.0647 | 0.4387
waveform | 0.6512 | 0.7172 | 1.1486 | 0.6969 | 0.6888 | 0.7071 | 0.6835 | 0.7720 | 0.8686 | 0.6573 | 0.6741
clean1data | 0.2425 | 0.3430 | 0.7101 | 0.3667 | 0.3305 | 0.3430 | 0.3889 | 0.3430 | 0.4674 | 0.3305 | 0.2899
SPECT | 0.3232 | 0.4732 | 0.6802 | 0.4405 | 0.4571 | 0.4571 | 0.4571 | 0.5325 | 0.5985 | 0.4232 | 0.4405
Zoo | 0.0000 | 0.0333 | 0.4447 | 0.0333 | 0.0471 | 0.0667 | 0.2828 | 0.2925 | 0.3636 | 0.0333 | 0.0577
ecoli | 0.2226 | 0.2861 | 0.5898 | 0.2806 | 0.2819 | 0.2819 | 0.3818 | 0.3721 | 0.7703 | 0.2806 | 0.2844
CongressEW | 0.1355 | 0.2709 | 0.7103 | 0.2534 | 0.2709 | 0.2346 | 0.4064 | 0.4064 | 0.5747 | 0.2346 | 0.2346
Exactly | 0.0000 | 0.5586 | 0.7430 | 0.3688 | 0.5762 | 0.5404 | 0.5514 | 0.6419 | 0.7266 | 0.0000 | 0.5477
Exactly2 | 0.4817 | 0.5621 | 0.7071 | 0.5441 | 0.5441 | 0.5550 | 0.6419 | 0.6000 | 0.7211 | 0.5441 | 0.5441
M-of-n | 0.0000 | 0.4000 | 0.6419 | 0.3225 | 0.4147 | 0.3795 | 0.5762 | 0.6419 | 0.6261 | 0.0000 | 0.4690
Vote | 0.1633 | 0.2828 | 0.7916 | 0.2309 | 0.2828 | 0.2582 | 0.4000 | 0.4761 | 0.4899 | 0.2582 | 0.2828
krvskp | 0.1621 | 0.2896 | 0.6896 | 0.2209 | 0.2526 | 0.2093 | 0.2399 | 0.2293 | 0.5983 | 0.1659 | 0.2694
heart | 0.3455 | 0.5183 | 0.7228 | 0.4405 | 0.4732 | 0.4732 | 0.4405 | 0.5464 | 0.6465 | 0.4232 | 0.4571
Table 5. Results of the standard deviation for all methods.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 0.0159 | 0.0146 | 0.1665 | 0.0383 | 0.0456 | 0.0389 | 0.0571 | 0.0413 | 0.0447 | 0.0391 | 0.0511
ionosphere | 0.0626 | 0.0680 | 0.0915 | 0.0432 | 0.0457 | 0.0608 | 0.0514 | 0.0621 | 0.0659 | 0.0478 | 0.0567
wine | 0.0000 | 0.0347 | 0.1106 | 0.0214 | 0.0302 | 0.0329 | 0.0846 | 0.0516 | 0.0658 | 0.0175 | 0.0289
breastcancer | 0.0160 | 0.0521 | 0.1066 | 0.0480 | 0.0456 | 0.0441 | 0.0491 | 0.0674 | 0.0569 | 0.0448 | 0.0483
glass | 0.0335 | 0.0296 | 0.0487 | 0.0221 | 0.0251 | 0.0250 | 0.0319 | 0.0431 | 0.0485 | 0.0244 | 0.0279
sonar | 0.0555 | 0.0758 | 0.0771 | 0.0718 | 0.0666 | 0.0847 | 0.0822 | 0.0877 | 0.0658 | 0.0939 | 0.0941
Lymphography | 0.0521 | 0.0832 | 0.0840 | 0.0570 | 0.0610 | 0.0659 | 0.0804 | 0.1079 | 0.1158 | 0.0641 | 0.0528
tic-tac-toe | 0.0000 | 0.1650 | 0.2297 | 0.0116 | 0.1778 | 0.1726 | 0.1040 | 0.0947 | 0.1954 | 0.0108 | 0.1639
waveform | 0.0150 | 0.0229 | 0.1369 | 0.0132 | 0.0166 | 0.0217 | 0.0191 | 0.0372 | 0.0541 | 0.0127 | 0.0166
clean1data | 0.0085 | 0.0516 | 0.0951 | 0.0417 | 0.0372 | 0.0435 | 0.0459 | 0.0467 | 0.0547 | 0.0382 | 0.0370
SPECT | 0.0250 | 0.0477 | 0.0994 | 0.0551 | 0.0529 | 0.0538 | 0.0503 | 0.0669 | 0.0648 | 0.0443 | 0.0507
Zoo | 0.0000 | 0.0160 | 0.1566 | 0.0093 | 0.0181 | 0.0204 | 0.0700 | 0.0620 | 0.0962 | 0.0106 | 0.0158
ecoli | 0.0305 | 0.0337 | 0.1048 | 0.0221 | 0.0260 | 0.0257 | 0.0345 | 0.0454 | 0.1100 | 0.0231 | 0.0233
CongressEW | 0.0419 | 0.0492 | 0.1553 | 0.0426 | 0.0430 | 0.0381 | 0.0691 | 0.0697 | 0.1088 | 0.0612 | 0.0460
Exactly | 0.0000 | 0.2162 | 0.1336 | 0.0827 | 0.2236 | 0.2005 | 0.1178 | 0.2065 | 0.1163 | 0.0000 | 0.1944
Exactly2 | 0.0162 | 0.0295 | 0.0694 | 0.0246 | 0.0235 | 0.0292 | 0.0412 | 0.0307 | 0.0407 | 0.0259 | 0.0218
M-of-n | 0.0000 | 0.1540 | 0.2008 | 0.0683 | 0.1509 | 0.1171 | 0.1100 | 0.1836 | 0.1269 | 0.0000 | 0.1161
Vote | 0.0685 | 0.0373 | 0.1505 | 0.0381 | 0.0585 | 0.0436 | 0.0675 | 0.0776 | 0.0873 | 0.0650 | 0.0614
krvskp | 0.0043 | 0.0409 | 0.1207 | 0.0205 | 0.0287 | 0.0186 | 0.0268 | 0.0311 | 0.1164 | 0.0176 | 0.0308
heart | 0.0109 | 0.0424 | 0.0932 | 0.0395 | 0.0431 | 0.0363 | 0.0390 | 0.0633 | 0.0794 | 0.0378 | 0.0377
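Tables 2–5 report the mean, minimum, maximum, and standard deviation of the fitness values obtained over the independent runs. As a minimal illustration of how a wrapper feature-selection fitness of this kind is commonly computed, the sketch below combines the classification error with the ratio of selected features; the KNN classifier, the 5-fold cross-validation, and the weight alpha = 0.99 are illustrative assumptions, not necessarily the exact settings used in this paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def wrapper_fitness(mask, X, y, alpha=0.99):
    """Illustrative wrapper fitness: alpha * error + (1 - alpha) * feature ratio."""
    selected = np.flatnonzero(mask)
    if selected.size == 0:
        return 1.0  # an empty feature subset is assigned the worst possible fitness
    # classification error of the candidate subset (KNN is an assumed classifier here)
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                          X[:, selected], y, cv=5).mean()
    error = 1.0 - acc
    # fraction of features kept by the candidate solution
    ratio = selected.size / mask.size
    return alpha * error + (1.0 - alpha) * ratio
```

Lower fitness therefore corresponds to a subset that is both accurate and compact, which is why the zero entries for AOAGA in Table 2 coincide with the perfect accuracies reported in Table 6.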
Table 6. Results of the accuracy measure for all methods.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 0.9905 | 0.9875 | 0.8496 | 0.9840 | 0.9820 | 0.9808 | 0.9842 | 0.9675 | 0.9525 | 0.9880 | 0.9857
ionosphere | 0.9785 | 0.9517 | 0.8380 | 0.9562 | 0.9468 | 0.9494 | 0.9484 | 0.9231 | 0.8932 | 0.9643 | 0.9643
wine | 1.0000 | 0.9855 | 0.8575 | 0.9980 | 0.9955 | 0.9942 | 0.9643 | 0.9286 | 0.8968 | 0.9987 | 0.9961
breastcancer | 0.9735 | 0.9481 | 0.8329 | 0.9617 | 0.9513 | 0.9523 | 0.9542 | 0.9289 | 0.8883 | 0.9697 | 0.9708
glass | 0.8066 | 0.7547 | 0.6593 | 0.6695 | 0.6253 | 0.6156 | 0.6873 | 0.5585 | 0.5100 | 0.6965 | 0.6491
sonar | 0.9846 | 0.9442 | 0.8246 | 0.9522 | 0.9522 | 0.9615 | 0.9577 | 0.9192 | 0.8846 | 0.9767 | 0.9709
Lymphography | 0.9324 | 0.8865 | 0.7076 | 0.5730 | 0.7297 | 0.5707 | 0.7290 | 0.4981 | 0.4000 | 0.8092 | 0.6958
tic-tac-toe | 1.0000 | 0.9283 | 0.6563 | 0.9999 | 0.9441 | 0.9515 | 0.9885 | 0.7790 | 0.6886 | 0.9999 | 0.9503
waveform | 0.7928 | 0.7761 | 0.6005 | 0.7848 | 0.7831 | 0.7826 | 0.7874 | 0.7749 | 0.7411 | 0.7921 | 0.7857
clean1data | 0.9468 | 0.9361 | 0.7988 | 0.9289 | 0.9316 | 0.9373 | 0.9261 | 0.9253 | 0.8766 | 0.9484 | 0.9549
SPECT | 0.9104 | 0.8652 | 0.7593 | 0.8784 | 0.8639 | 0.8695 | 0.8725 | 0.8333 | 0.7723 | 0.8900 | 0.8806
Zoo | 1.0000 | 0.9815 | 0.9138 | 0.9966 | 0.9269 | 0.9029 | 0.7291 | 0.3726 | 0.2389 | 0.9954 | 0.9634
ecoli | 0.8405 | 0.7853 | 0.6032 | 0.8405 | 0.8423 | 0.8452 | 0.8393 | 0.7857 | 0.8185 | 0.8413 | 0.8405
CongressEW | 0.9878 | 0.9756 | 0.8324 | 0.9711 | 0.9642 | 0.9694 | 0.9624 | 0.9419 | 0.8966 | 0.9777 | 0.9734
Exactly | 1.0000 | 0.9072 | 0.6363 | 0.9903 | 0.8917 | 0.9323 | 0.9828 | 0.7696 | 0.6332 | 1.0000 | 0.9313
Exactly2 | 0.7790 | 0.7461 | 0.6358 | 0.7505 | 0.7375 | 0.7349 | 0.7295 | 0.6986 | 0.6601 | 0.7594 | 0.7478
M-of-n | 1.0000 | 0.9393 | 0.7338 | 0.9939 | 0.9571 | 0.9821 | 0.9864 | 0.8704 | 0.7384 | 1.0000 | 0.9851
Vote | 0.9867 | 0.9636 | 0.8080 | 0.9687 | 0.9560 | 0.9618 | 0.9604 | 0.9267 | 0.8747 | 0.9693 | 0.9667
krvskp | 0.9762 | 0.9638 | 0.7065 | 0.9689 | 0.9610 | 0.9728 | 0.9698 | 0.9741 | 0.8615 | 0.9855 | 0.9725
heart | 0.8866 | 0.8473 | 0.6811 | 0.8706 | 0.8542 | 0.8388 | 0.8677 | 0.8149 | 0.7468 | 0.8781 | 0.8433
Table 7. Selected features number obtained by each algorithm.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 10.50 | 13.23 | 11.13 | 15.03 | 15.97 | 15.83 | 15.78 | 15.53 | 15.47 | 15.09 | 13.97
ionosphere | 11.00 | 12.32 | 11.23 | 11.94 | 15.49 | 15.57 | 14.49 | 16.06 | 17.29 | 14.11 | 12.40
wine | 6.22 | 6.27 | 6.73 | 7.56 | 7.46 | 7.26 | 7.00 | 7.34 | 6.49 | 7.49 | 7.06
breastcancer | 11.00 | 13.27 | 12.13 | 11.09 | 15.74 | 15.86 | 14.51 | 16.09 | 16.31 | 14.70 | 12.31
glass | 4.00 | 4.90 | 4.82 | 4.93 | 5.31 | 5.34 | 5.23 | 5.09 | 4.80 | 4.97 | 4.74
sonar | 24.00 | 24.50 | 24.81 | 27.54 | 29.69 | 30.71 | 29.60 | 30.31 | 29.91 | 29.15 | 24.14
Lymphography | 7.33 | 7.67 | 7.73 | 9.43 | 10.14 | 9.71 | 8.57 | 9.86 | 9.34 | 9.00 | 8.06
tic-tac-toe | 9.00 | 8.05 | 5.41 | 9.00 | 8.31 | 8.37 | 8.86 | 6.29 | 5.00 | 9.00 | 8.29
waveform | 11.33 | 12.33 | 6.11 | 14.54 | 13.20 | 13.17 | 13.60 | 13.97 | 11.51 | 13.50 | 11.69
clean1data | 47.50 | 63.57 | 48.90 | 71.63 | 81.74 | 79.91 | 82.00 | 85.63 | 82.09 | 81.06 | 58.54
SPECT | 8.50 | 8.20 | 9.28 | 9.26 | 10.74 | 11.03 | 10.80 | 11.89 | 11.34 | 10.49 | 9.71
Zoo | 7.56 | 8.62 | 7.80 | 9.91 | 9.40 | 9.31 | 9.14 | 8.63 | 8.40 | 9.51 | 9.63
ecoli | 5.20 | 4.73 | 5.56 | 5.13 | 4.80 | 5.29 | 4.97 | 4.20 | 3.51 | 4.94 | 5.11
CongressEW | 3.83 | 6.66 | 4.21 | 6.63 | 7.57 | 7.70 | 7.43 | 7.53 | 7.63 | 7.00 | 6.13
Exactly | 6.56 | 7.29 | 7.10 | 7.40 | 7.60 | 7.37 | 7.20 | 8.13 | 6.67 | 6.83 | 6.83
Exactly2 | 3.00 | 4.50 | 3.90 | 4.00 | 6.07 | 5.40 | 5.73 | 6.40 | 7.23 | 6.50 | 5.30
M-of-n | 7.25 | 7.82 | 5.78 | 7.53 | 7.70 | 7.87 | 7.37 | 7.87 | 6.87 | 7.17 | 7.13
Vote | 6.17 | 6.70 | 6.80 | 6.10 | 7.97 | 7.50 | 7.33 | 7.77 | 8.07 | 7.67 | 6.13
krvskp | 12.83 | 17.23 | 12.90 | 20.40 | 20.43 | 20.03 | 19.97 | 20.57 | 18.57 | 19.60 | 15.57
heart | 6.25 | 6.53 | 7.37 | 7.17 | 7.60 | 8.40 | 7.37 | 7.17 | 6.93 | 7.37 | 6.70
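Table 7 lists the average number of features selected by each algorithm. In continuous optimizers such as the AOA, a candidate solution is typically mapped to a binary feature mask before the selected features are counted; the fixed 0.5 threshold in the sketch below is a common convention and an assumption here, not necessarily the exact binarization rule of the proposed AOAGA.

```python
import numpy as np

def binarize_solution(position, threshold=0.5):
    """Map a continuous position vector to a 0/1 feature mask (assumed threshold rule)."""
    return (np.asarray(position) > threshold).astype(int)

# toy example with a 30-dimensional solution (e.g., breastWDBC has 30 features, cf. Table 1)
rng = np.random.default_rng(42)
mask = binarize_solution(rng.random(30))
print(mask.sum())   # number of selected features, as summarized in Table 7
print(mask.mean())  # ratio of selected features, as reported in Table 15
```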
Table 8. Processing time for all algorithms.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 38.226 | 56.868 | 6.608 | 15.705 | 7.075 | 6.337 | 6.352 | 6.294 | 6.959 | 6.301 | 6.334
ionosphere | 35.944 | 51.773 | 6.393 | 15.432 | 6.866 | 6.160 | 6.184 | 6.124 | 6.812 | 6.174 | 6.213
wine | 12.851 | 52.921 | 6.216 | 15.418 | 7.174 | 6.442 | 6.419 | 6.447 | 6.725 | 6.460 | 6.418
breastcancer | 34.480 | 52.491 | 6.253 | 15.040 | 6.745 | 6.052 | 6.021 | 6.043 | 6.685 | 6.036 | 6.076
glass | 35.307 | 51.165 | 5.045 | 11.607 | 7.377 | 6.576 | 6.653 | 6.568 | 6.778 | 6.557 | 6.617
sonar | 34.131 | 47.956 | 6.288 | 14.656 | 6.714 | 6.029 | 6.040 | 6.022 | 7.143 | 6.061 | 5.974
Lymphography | 29.140 | 48.230 | 5.274 | 13.185 | 6.645 | 5.576 | 5.868 | 5.805 | 6.175 | 5.717 | 5.655
tic-tac-toe | 28.221 | 61.421 | 6.958 | 15.562 | 8.152 | 7.341 | 7.361 | 7.357 | 7.319 | 7.259 | 7.153
waveform | 208.100 | 249.785 | 12.398 | 41.034 | 20.358 | 18.163 | 18.189 | 18.704 | 18.460 | 18.126 | 17.410
clean1data | 41.632 | 62.067 | 6.688 | 16.741 | 7.837 | 7.064 | 7.047 | 7.073 | 10.088 | 7.014 | 6.762
SPECT | 33.499 | 50.294 | 5.865 | 14.640 | 6.698 | 6.048 | 6.091 | 6.024 | 6.477 | 6.059 | 6.116
Zoo | 15.742 | 50.040 | 4.934 | 14.817 | 7.337 | 6.317 | 6.385 | 6.428 | 6.503 | 6.187 | 5.824
ecoli | 25.238 | 31.632 | 4.541 | 11.425 | 5.772 | 5.195 | 5.189 | 5.240 | 5.269 | 5.178 | 5.209
CongressEW | 34.109 | 52.574 | 6.453 | 15.527 | 7.252 | 6.474 | 6.468 | 6.448 | 6.830 | 6.486 | 6.474
Exactly | 26.764 | 57.360 | 6.686 | 16.482 | 7.891 | 7.202 | 7.058 | 7.033 | 7.349 | 7.110 | 7.066
Exactly2 | 33.948 | 49.714 | 6.936 | 16.190 | 7.772 | 6.856 | 6.889 | 6.988 | 7.584 | 7.025 | 6.968
M-of-n | 14.282 | 57.857 | 6.887 | 16.764 | 8.054 | 7.205 | 7.231 | 7.227 | 7.554 | 7.200 | 7.232
Vote | 34.804 | 52.274 | 6.439 | 15.561 | 7.247 | 6.527 | 6.466 | 6.544 | 6.733 | 6.557 | 6.508
krvskp | 122.824 | 169.704 | 10.450 | 29.577 | 14.121 | 12.649 | 12.548 | 12.698 | 13.119 | 12.573 | 11.564
heart | 33.902 | 51.467 | 10.302 | 24.904 | 11.529 | 10.486 | 10.318 | 10.322 | 11.038 | 10.278 | 10.636
Table 9. Results of the Friedman statistical test.
Dataset | AOAGA | AOA | SMA | HHO | GA | MVO | SSA | MFO | GOA | PSO | GWO
breastWDBC | 3.100 | 4.050 | 10.033 | 5.667 | 5.917 | 6.367 | 4.533 | 8.667 | 9.900 | 3.217 | 4.550
ionosphere | 2.050 | 2.817 | 10.333 | 5.333 | 6.550 | 6.233 | 6.717 | 8.217 | 10.000 | 3.800 | 3.950
wine | 3.900 | 5.633 | 10.133 | 4.000 | 4.600 | 4.733 | 6.000 | 8.933 | 9.667 | 4.033 | 4.367
breastcancer | 2.267 | 3.150 | 10.117 | 5.417 | 6.767 | 6.517 | 6.467 | 8.083 | 10.200 | 3.383 | 3.633
glass | 2.717 | 3.117 | 9.900 | 4.700 | 6.333 | 5.800 | 5.117 | 9.167 | 10.067 | 4.117 | 4.967
sonar | 3.583 | 3.983 | 10.700 | 5.967 | 6.567 | 5.350 | 5.700 | 7.883 | 9.583 | 3.483 | 4.700
Lymphography | 2.917 | 3.767 | 10.517 | 4.800 | 6.700 | 5.600 | 4.933 | 8.817 | 9.400 | 3.250 | 5.300
tic-tac-toe | 3.817 | 4.567 | 9.917 | 4.417 | 5.750 | 5.633 | 4.000 | 9.300 | 9.467 | 4.350 | 5.783
waveform | 4.133 | 5.600 | 10.800 | 6.683 | 5.800 | 6.250 | 5.517 | 6.933 | 9.867 | 2.050 | 5.367
clean1data | 4.068 | 4.700 | 10.850 | 7.033 | 6.133 | 5.517 | 6.450 | 6.783 | 9.950 | 3.750 | 3.150
SPECT | 3.500 | 6.550 | 10.033 | 5.367 | 6.750 | 6.300 | 5.700 | 8.133 | 9.817 | 3.750 | 5.100
Zoo | 3.717 | 6.067 | 9.600 | 3.717 | 5.300 | 5.200 | 5.967 | 9.017 | 9.200 | 3.717 | 4.500
ecoli | 2.933 | 4.700 | 9.083 | 4.300 | 5.783 | 5.583 | 4.950 | 8.983 | 9.833 | 4.617 | 5.233
CongressEW | 3.683 | 3.833 | 10.283 | 4.417 | 6.350 | 5.450 | 6.000 | 8.867 | 9.317 | 2.783 | 5.567
Exactly | 3.650 | 6.200 | 10.017 | 4.567 | 5.767 | 5.567 | 4.333 | 8.500 | 10.183 | 3.167 | 4.050
Exactly2 | 3.850 | 4.667 | 8.733 | 4.317 | 6.300 | 6.317 | 5.650 | 8.967 | 10.133 | 4.083 | 4.983
M-of-n | 3.917 | 6.150 | 9.933 | 4.750 | 6.333 | 4.833 | 4.283 | 8.483 | 9.833 | 3.433 | 4.050
Vote | 3.867 | 4.100 | 10.250 | 4.283 | 6.867 | 5.767 | 5.350 | 8.333 | 10.117 | 4.367 | 4.700
krvskp | 2.900 | 4.450 | 10.833 | 6.700 | 7.633 | 5.533 | 6.200 | 4.967 | 10.050 | 1.533 | 5.200
heart | 2.400 | 3.817 | 10.317 | 4.917 | 6.267 | 6.283 | 5.033 | 7.917 | 9.700 | 3.817 | 5.533
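Table 9 gives the Friedman mean rank of each algorithm on every dataset, where a lower rank indicates better fitness. As a minimal sketch of how such mean ranks (and the associated Friedman test) can be obtained when the per-run fitness values of all algorithms on one dataset are stored column-wise, assuming SciPy is available (the random matrix and the 30 runs below are placeholders):

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# rows = independent runs, columns = the 11 compared algorithms (placeholder data)
rng = np.random.default_rng(0)
fitness_runs = rng.random((30, 11))

# rank the algorithms within each run (rank 1 = lowest, i.e., best fitness),
# then average over the runs to obtain mean ranks of the kind shown in Table 9
mean_ranks = rankdata(fitness_runs, axis=1).mean(axis=0)
print(np.round(mean_ranks, 3))

# Friedman test for significant differences among the compared algorithms
stat, p_value = friedmanchisquare(*fitness_runs.T)
print(stat, p_value)
```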
Table 10. Comparison with the state-of-the-art methods.
Datasets | AOAGA | SMAFA [9] | BSSAS3 [67] | bGWO2 [68] | SbBOA [69] | BGOAM [70] | Das [71] | S-bBOA [69]
breastWDBC | 0.990 | 0.989 | 0.948 | 0.935 | 0.971 | 0.970 | - | 0.971
ionosphere | 0.979 | 0.971 | 0.918 | 0.834 | 0.907 | 0.946 | 0.865 | 0.907
wine | 1.000 | 1.000 | 0.993 | 0.920 | 0.984 | 0.989 | 0.961 | 0.984
breastcancer | 0.973 | 0.976 | 0.976 | 0.975 | 0.969 | 0.974 | 0.971 | 0.969
glass | 0.807 | 0.795 | - | - | - | - | 0.692 | -
sonar | 0.985 | 0.989 | 0.937 | 0.729 | 0.936 | 0.915 | 0.793 | 0.936
Lymphography | 0.932 | 0.930 | 0.890 | 0.700 | 0.868 | 0.912 | - | 0.868
tic-tac-toe | 1.000 | 0.857 | 0.821 | - | 0.798 | 0.791 | - | 0.798
waveform | 0.793 | 0.793 | 0.733 | 0.789 | 0.743 | 0.751 | - | 0.743
clean1data | 0.947 | 0.949 | 0.880 | 0.727 | 0.883 | - | - | 0.883
SPECT | 0.910 | 0.906 | 0.836 | 0.822 | 0.846 | 0.826 | - | 0.846
Zoo | 1.000 | 1.000 | 1.000 | 0.879 | 0.978 | 0.958 | 0.960 | 0.978
ecoli | 0.840 | 0.857 | - | - | - | - | 0.789 | -
CongressEW | 0.988 | 0.987 | 0.963 | 0.938 | 0.959 | 0.976 | 0.526 | 0.959
Exactly | 1.000 | 0.999 | 0.980 | 0.776 | 0.972 | 1.000 | - | 0.972
Exactly2 | 0.779 | 0.774 | 0.758 | 0.750 | 0.760 | 0.736 | - | 0.760
M-of-n | 1.000 | 1.000 | 0.991 | 0.963 | 0.972 | 1.000 | - | 0.972
Vote | 0.987 | 0.981 | 0.951 | 0.920 | 0.965 | 0.963 | - | 0.965
krvskp | 0.976 | 0.976 | 0.964 | 0.956 | 0.966 | 0.974 | - | 0.966
heart | 0.887 | 0.885 | 0.860 | 0.776 | 0.824 | 0.836 | 0.784 | 0.824
Table 11. Real Application: Average of the fitness value.
Dataset | AOAGA | AOA | SMA | GA | HHO | PSO | SSA | MFO | GOA
DLBC2002 | −234.3245 | −230.2966 | −233.6701 | −230.0083 | −232.3211 | −230.4959 | −229.1562 | −228.8837 | −227.5548
Lung_cancer | −62.9218 | −59.5755 | −57.7912 | −60.0389 | −62.3429 | −59.6144 | −58.6842 | −57.4670 | −58.4467
Table 12. Real Application: Minimum of the fitness value.
Dataset | AOAGA | AOA | SMA | GA | HHO | PSO | SSA | MFO | GOA
DLBC2002 | −236.4848 | −234.0571 | −236.4848 | −234.8838 | −235.6193 | −234.8838 | −232.6768 | −231.3313 | −230.0000
Lung_cancer | −63.9350 | −63.9350 | −59.1967 | −63.9350 | −63.9350 | −63.9350 | −63.9350 | −60.8395 | −60.8263
Table 13. Real Application: Maximum of the fitness value.
Dataset | AOAGA | AOA | SMA | GA | HHO | PSO | SSA | MFO | GOA
DLBC2002 | −231.3218 | −227.6297 | −231.2671 | −227.3983 | −231.1883 | −227.8288 | −226.5824 | −226.0087 | −224.0000
Lung_cancer | −59.2110 | −57.5195 | −55.8756 | −57.5195 | −59.1744 | −57.5195 | −56.9506 | −56.1811 | −56.0671
Table 14. Real Application: Standard deviation of the fitness value.
Dataset | AOAGA | AOA | SMA | GA | HHO | PSO | SSA | MFO | GOA
DLBC2002 | 1.3164 | 1.9293 | 1.7683 | 2.1215 | 1.4501 | 2.1030 | 1.7937 | 1.6517 | 2.3251
Lung_cancer | 1.1933 | 1.8862 | 1.2576 | 2.2921 | 1.8003 | 1.8522 | 1.9929 | 1.2527 | 2.3796
Table 15. Real Application: Ratio of the selected features.
Dataset | AOAGA | AOA | SMA | GA | HHO | PSO | SSA | MFO | GOA
DLBC2002 | 0.2809 | 0.2831 | 0.2762 | 0.4987 | 0.2867 | 0.5010 | 0.5010 | 0.5312 | 0.5008
Lung_cancer | 0.3504 | 0.3589 | 0.3521 | 0.5036 | 0.3987 | 0.4983 | 0.5001 | 0.5227 | 0.4961
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
