Introduction

In combinatorial optimization, the knapsack problem is one of the most challenging NP-hard problems. It has been studied extensively in recent years and arises in various real-world applications such as resource allocation, portfolio selection, assignment, and reliability problems [25]. Consider d items with profits \(p_{k};~(k=1,2,\ldots ,d)\) and weights \(w_{k};~(k=1,2,\ldots ,d)\) to be packed in a knapsack of maximum capacity \(w_\mathrm{max}\). The variable \(x_{k};~(k=1,2,\ldots ,d)\) indicates whether the kth item is selected for the knapsack: \(x_{k}=0\) means that the kth item is not selected, and \(x_{k}=1\) means that it is placed in the knapsack; each item can be selected at most once. The mathematical model of the \(0{-}1\) knapsack problem (0-1KP) is given as:

Inputs: Number of items d

\(p:[d] \rightarrow \mathbb {N}\), \(w:[d]\rightarrow \mathbb {N}\), \(w_\mathrm{max}\in \mathbb {N}\):

$$\begin{aligned}&\text {Objective function:}~~~ \max ~f=\sum _{k=1}^{d}~p_{k}x_{k} \end{aligned}$$
(1)
$$\begin{aligned}&\text {Constraints:}~~~ \sum _{k=1}^{d}~w_{k}x_{k} \le w_\mathrm{max} \end{aligned}$$
(2)
$$\begin{aligned}&x_{k} = 0 ~\text {or}~ 1;~~k=1,2,\ldots ,d. \end{aligned}$$
(3)

The main aim of the knapsack problem is to maximize the total profit of the selected items such that their total weight does not exceed the capacity of the knapsack. In practice, 0-1KP instances are non-differentiable, discontinuous, and high-dimensional; therefore, classical exact approaches such as the branch and bound method [17] and dynamic programming [7] become impractical. For example, in high dimensions, choosing the global optimal solution from the exhaustive set of feasible solutions is not realistic. Hence, to overcome these difficulties, numerous metaheuristic algorithms have been developed and studied over the last three decades. Metaheuristic algorithms require neither continuity nor differentiability of the objective function. Various such algorithms have been developed to solve complex optimization problems, including the genetic algorithm, ant colony optimization, differential evolution, and the particle swarm optimization algorithm, and they have been applied to real-world problems such as two-agent multi-facility customer order scheduling [23], earthquake casualty prediction [11], task scheduling [39], and the flow shop scheduling problem [15, 16].
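To make the classical baseline concrete, the textbook dynamic-programming recurrence solves 0-1KP exactly in \(O(d \cdot w_\mathrm{max})\) time and memory, which is workable only for modest capacities and dimensions; a minimal Python sketch (illustrative only, not the specific method of [7]) is:

```python
def knapsack_dp(profits, weights, w_max):
    """Exact 0-1 knapsack via dynamic programming in O(d * w_max) time."""
    best = [0] * (w_max + 1)  # best[c] = max profit achievable within capacity c
    for p, w in zip(profits, weights):
        # Traverse capacities downward so each item is packed at most once.
        for c in range(w_max, w - 1, -1):
            best[c] = max(best[c], best[c - w] + p)
    return best[w_max]

# Tiny example: the optimum is 19 (select the items with profits 7 and 12).
print(knapsack_dp([10, 7, 12], [4, 3, 6], 9))
```

The table over capacities is exactly why the method scales poorly: its cost grows linearly with \(w_\mathrm{max}\), not with the number of items alone.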

Many metaheuristic algorithms have been proposed to solve 0-1KP in recent years. Shi [33] modified the ant colony optimization algorithm to solve the classical 0-1KP, whereas Lin [22] used a genetic algorithm to obtain solutions of knapsack problems with uncertain weights. Li and Li [21] proposed a binary particle swarm optimization algorithm with multi-mutation to tackle the knapsack problem. A schema-guiding evolutionary algorithm was proposed for the knapsack problem by Liu and Liu [24]. Truong et al. [34] proposed a chemical reaction optimization algorithm for solving 0-1KP with a greedy strategy to repair infeasible solutions. Researchers have paid great attention to developing binary and discrete versions of various algorithms, such as the binary artificial fish swarm algorithm [3], the adaptive binary harmony search algorithm [36], the binary monkey algorithm [40], the binary multi-verse optimizer [1], and the discrete shuffled frog leaping algorithm [5], for solving the 0-1KP. However, many of the algorithms developed for knapsack problems handle only low-dimensional instances, whereas real-world applications involve very high dimensions, and high-dimensional problems are challenging to handle. Zou et al. [42] proposed a novel global harmony search algorithm with genetic mutation for obtaining solutions of knapsack problems. Moosavian [32] proposed the soccer league competition algorithm to tackle high-dimensional knapsack problems.

Among these metaheuristic algorithms, the gaining sharing knowledge-based optimization algorithm (GSK) is a recently developed human-based algorithm over continuous space [30]. GSK is based on the ideology of how humans acquire and share knowledge during their lifetime. It depends on two essential stages: the junior (beginners) gaining and sharing stage and the senior (experts) gaining and sharing stage. In both stages, persons gain knowledge from their networks and share the acquired knowledge with other persons to enhance their skills.

Fig. 1: Pseudocode for junior gaining sharing knowledge stage

Fig. 2: Pseudocode for senior gaining sharing knowledge stage

The GSK algorithm has been applied to continuous optimization problems, and the obtained results prove its robustness, efficiency, and ability to find optimal solutions. GSK has shown significant capability in solving two different benchmark sets over continuous space: the CEC2017 benchmark suite (30 unconstrained problems with dimensions 10, 30, 50, and 100) [2] and the CEC2011 benchmark suite (22 constrained problems with dimensions from 1 to 140) [12]. Moreover, it outperforms the 10 most famous metaheuristics, such as differential evolution, particle swarm optimization, the genetic algorithm, the grey wolf optimizer, teaching-learning-based optimization, ant colony optimization, stochastic fractal search, and animal migration optimization, which in turn reflects its outstanding performance compared with other metaheuristics. This manuscript proposes a novel binary gaining sharing knowledge-based optimization algorithm (NBGSK) to solve binary optimization problems. NBGSK comprises two requisites: a binary junior (beginners) gaining and sharing stage and a binary senior (experts) gaining and sharing stage. These two stages enable NBGSK to explore the search space and intensify the exploitation tendency efficiently and effectively. The proposed NBGSK algorithm is applied to the NP-hard 0-1KP to evaluate its performance, and the obtained solutions are compared with existing results from the literature [32].

Fig. 3: Flowchart of GSK algorithm

The population size is one of the most important parameters of any metaheuristic algorithm, so choosing an appropriate population size is a critical task. A large population size extends diversity but uses more function evaluations. On the other hand, with a small population size, the search may become trapped in local optima. From the literature, the following observations guide the choice of the population size:

Table 1 Results of binary junior gaining and sharing stage of Case 1 with \(k_\mathrm{f}=1\)
  • The population size may be different for every problem [9].

  • It can be based on the dimension of the problems [31].

  • It may be varied or fixed throughout the optimization process according to the problems [6, 18].

Mohamed et al. [29] proposed an adaptive guided differential evolution algorithm with a population size reduction technique that reduces the population size gradually. As GSK is a population-based optimization algorithm, its mechanism depends on the size of the population. Hence, to enhance the performance of the NBGSK algorithm, a linear population size reduction mechanism that decreases the population size gradually is applied; the resulting variant is denoted PR-NBGSK. To check the performance of PR-NBGSK, it is employed on 0-1KP instances of small and large dimensions, and the results are compared with NBGSK, the binary bat algorithm [28], and different binary versions of the particle swarm optimization algorithm [27, 35].

The organization of the paper is as follows: the second section describes the GSK algorithm; the third section describes the proposed novel binary GSK algorithm; the population reduction scheme is elaborated in the fourth section; the numerical experiments and comparisons are given in the fifth section; and the final section contains the concluding remarks.

GSK algorithm

A constrained optimization problem is formulated as:

$$\begin{aligned} \min f(X);~~X=\left[ x_1,x_2,\ldots ,x_d\right] \end{aligned}$$

s.t.

$$\begin{aligned}&g_{t}(X)\le 0;~ t=1,2,\ldots ,m\\&x_{k}\in \left[ L_{k},U_{k}\right] ;~ k=1,2,\ldots ,d, \end{aligned}$$

where f denotes the objective function; \(X=\left[ x_1,x_2,\ldots ,x_d\right] \) is the vector of decision variables; \(g_{t}(X)\) are the inequality constraints; \(L_{k}\) and \(U_{k}\) are the lower and upper bounds of the decision variables, respectively; and d represents the dimension of the individuals. If the problem is in maximization form, it is handled as minimization = − maximization.

Table 2 Results of binary junior gaining and sharing stage of Case 2 with \(k_\mathrm{f}=1\)

In recent years, a novel human-based optimization algorithm, the gaining sharing knowledge-based optimization algorithm (GSK) [30], has been developed. It follows the concept of gaining and sharing knowledge throughout the human lifetime. GSK mainly relies on two important stages:

1. Junior gaining and sharing stage (early–middle stage)

2. Senior gaining and sharing stage (middle–later stage).

In the early–middle stage, or junior gaining and sharing stage, it is not yet possible to acquire knowledge from social media or friends; an individual gains knowledge from known persons such as family members, relatives, or neighbours. Owing to a lack of experience, these people want to share their thoughts or gained knowledge with other people, who may or may not be from their networks, and they do not have enough experience to classify others as good or bad.

In contrast, in the middle–later stage, or senior gaining and sharing stage, individuals gain knowledge from their large networks, such as social media friends and colleagues. These people have much experience and a strong ability to categorize others as good or bad. Thus, they can share their knowledge or skills with the most suitable persons so that those persons can enhance their skills. The dimensions of the junior and senior stages are calculated and depend on the knowledge factor. The process of GSK described above can be formulated mathematically in the following steps:

Fig. 4: Pseudocode for NBGSK

Table 3 Results of binary senior gaining and sharing stage of Case 1 with \(k_\mathrm{f}=1\)
Table 4 Results of binary senior gaining and sharing stage of Case 2 with \(k_\mathrm{f}=1\)
Fig. 5: Pseudocode for PR-NBGSK

Step 1 In the first step, the number of persons (population size NP) is assumed. Let \(x_{t}~(t=1,2,\ldots ,\mathrm{NP})\) be the individuals of the population, with \(x_{t}=(x_{t1},x_{t2},\ldots ,x_{td})\), where d is the number of branches of knowledge assigned to an individual, and let \(f_{t}~(t=1,2,\ldots ,\mathrm{NP})\) be the corresponding objective function values.

To obtain a starting point for the optimization process, an initial population is created randomly within the boundary constraints as:

$$\begin{aligned} x_{tk}^0=L_{k}+\mathrm{rand}_{k}*\left( U_{k}-L_{k}\right) , \end{aligned}$$
(4)

where \(\mathrm{rand}_{k}\) denotes a uniformly distributed random number in the range [0, 1].
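As an illustration, a minimal NumPy sketch of this initialization (function and parameter names are hypothetical) is:

```python
import numpy as np

def init_population(NP, L, U, seed=0):
    """Eq. (4): x_tk = L_k + rand_k * (U_k - L_k) for each of NP individuals."""
    rng = np.random.default_rng(seed)
    L, U = np.asarray(L, dtype=float), np.asarray(U, dtype=float)
    return L + rng.random((NP, L.size)) * (U - L)  # shape (NP, d)
```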

Step 2 First, the dimensions of the junior and senior stages are computed through the following formulas:

$$\begin{aligned} d_\mathrm{junior}= & {} d \times \left( \frac{\mathrm{Gen}^{\mathrm{max}}-G}{\mathrm{Gen}^{\mathrm{max}}}\right) ^{K} \end{aligned}$$
(5)
$$\begin{aligned} d_{\mathrm{senior}}= & {} d-d_{\mathrm{junior}}, \end{aligned}$$
(6)

where \(K(>0)\) denotes the knowledge rate, which governs the experience rate. \(d_{\mathrm{junior}}\) and \(d_{\mathrm{senior}}\) represent the dimension for the junior and senior stages, respectively. \(\mathrm{Gen}^{\mathrm{max}}\) is the maximum number of generations, and G denotes the generation number.
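A one-line sketch of this dimension split (returning the fractional values of Eqs. (5)-(6); rounding them to integer counts is an implementation assumption) is:

```python
def stage_dims(d, G, gen_max, K):
    """Eqs. (5)-(6): share of the d dimensions updated by each stage at generation G."""
    d_junior = d * ((gen_max - G) / gen_max) ** K  # shrinks as G grows
    return d_junior, d - d_junior                  # (junior, senior)
```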

Step 3 Junior gaining sharing knowledge stage: in this stage, early-aged people gain knowledge from their small networks and share their views with other people who may or may not belong to their group. The individuals are updated as follows:

1. According to the objective function values, the individuals are arranged in ascending order as:

    $$\begin{aligned} x_{\mathrm{best}},\ldots ,x_{t-1},x_{t},x_{t+1},\ldots ,x_{\mathrm{worst}}. \end{aligned}$$

2. For every \(x_{t}~(t=1,2,\ldots ,\mathrm{NP})\), the nearest best \((x_{t-1})\) and nearest worst \((x_{t+1})\) individuals are selected for gaining knowledge, and an individual \((x_\mathrm{R})\) is selected at random for sharing knowledge. The pseudocode for updating the individuals is presented in Fig. 1, in which \(k_\mathrm{f}~(>0)\) is the knowledge factor.

Step 4 Senior gaining sharing knowledge stage: this stage comprises the impact and effect of other people (good or bad) on an individual. The update of the individual can be determined as follows:

1. After sorting the individuals in ascending order (based on the objective function values), they are classified into three categories (best, middle, and worst):

    best individuals \(=100p\%~(x_{\mathrm{pbest}})\), middle individuals \(=\mathrm{NP}-2\,(100p\%)~(x_{\mathrm{middle}})\), worst individuals \(=100p\%~(x_{\mathrm{pworst}})\).

2. For every individual \(x_{t}\), two random vectors are chosen from the top and bottom \(100p\%\) individuals for the gaining part, and a third one (a middle individual) is chosen for the sharing part, where \(p\in [0,1]\) is the percentage of the best and worst classes. The new individual is then updated through the pseudocode presented in Fig. 2.

The flowchart of GSK algorithm is shown in Fig. 3.

Proposed novel binary GSK algorithm (NBGSK)

To solve problems in binary space, a novel binary gaining sharing knowledge-based optimization algorithm (NBGSK) is suggested. In NBGSK, the initialization, the dimensions of the stages, and the working mechanisms of both stages (the junior and senior gaining and sharing stages) are redefined over binary space, while the remaining steps stay the same as in the original algorithm. The working mechanism of NBGSK is presented in the following subsections.

Binary initialization

In GSK, the initial population is obtained using Eq. (4); for a binary population, it is instead generated using the following equation:

$$\begin{aligned} x_{tk}^0=\mathrm{round}(\mathrm{rand}(0,1)), \end{aligned}$$
(7)

where the round operator converts the random number into the nearest binary value (0 or 1).
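A minimal sketch of this binary initialization (names hypothetical) is:

```python
import numpy as np

def init_binary_population(NP, d, seed=0):
    """Eq. (7): each component is round(rand(0,1)), i.e., 0 or 1 with equal probability."""
    rng = np.random.default_rng(seed)
    return np.rint(rng.random((NP, d))).astype(int)
```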

Evaluate the dimensions of stages

Before proceeding further, the dimensions of the junior (\(d_{\mathrm{junior}}\)) and senior (\(d_{\mathrm{senior}}\)) stages should be computed using the number of function evaluations (NFE) as:

$$\begin{aligned} d_{\mathrm{junior}}= & {} d \times \left( 1-\frac{\mathrm{NFE}}{\mathrm{MaxNFE}}\right) ^{K} \end{aligned}$$
(8)
$$\begin{aligned} d_{\mathrm{senior}}= & {} d-d_{\mathrm{junior}}, \end{aligned}$$
(9)

where \(K(>0)\) denotes the knowledge rate, which is randomly generated, and MaxNFE denotes the maximum number of function evaluations.
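The only change from Eqs. (5)-(6) is that the schedule is driven by NFE instead of the generation counter; a sketch under that reading is:

```python
def binary_stage_dims(d, nfe, max_nfe, K):
    """Eqs. (8)-(9): dimension split driven by the evaluation budget."""
    d_junior = d * (1 - nfe / max_nfe) ** K  # K is randomly generated in NBGSK
    return d_junior, d - d_junior
```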

Binary junior gaining and sharing stage

The binary junior gaining and sharing stage is based on the original GSK with \(k_\mathrm{f}=1\). In the original GSK, the individuals are updated using the pseudocode in Fig. 1, which contains two cases. These two cases are defined for the binary stage as follows:

Case 1 When \(f(x_\mathrm{R})<f(x_t)\): there are three different vectors \(\left( x_{t-1},x_{t+1},x_\mathrm{R}\right) \), each of which can take only two values (0 and 1). Therefore, a total of \(2^3\) combinations are possible, which are listed in Table 1. These eight combinations can be categorized into two subcases [(a) and (b)], each with four combinations. The results of every possible combination are presented in Table 1.

Subcase (a) If \(x_{t-1}\) is equal to \(x_{t+1}\), the result is equal to \(x_\mathrm{R}\).

Subcase (b) When \(x_{t-1}\) is not equal to \(x_{t+1}\), the result is the same as \(x_{t-1}\), taking −1 as 0 and 2 as 1.

The mathematical formulation of Case 1 is as follows:

$$\begin{aligned} x_{tk}^{\mathrm{new}}={\left\{ \begin{array}{ll} x_\mathrm{R} ;&{} \mathrm{if} ~x_{t-1}=x_{t+1}\\ x_{t-1} ;&{} \mathrm{if}~ x_{t-1} \ne x_{t+1}. \end{array}\right. } \end{aligned}$$
(10)

Case 2 When \(f(x_\mathrm{R})\ge f(x_t)\): there are four different vectors \((x_{t-1},x_{t},x_{t+1},x_\mathrm{R})\), each of which takes only two values (0 and 1). Thus, a total of \(2^4\) combinations are possible, as presented in Table 2. These 16 combinations can be divided into two subcases [(c) and (d)], which contain 4 and 12 combinations, respectively.

Subcase (c) If \(x_{t-1}\) is not equal to \(x_{t+1}\), but \(x_{t+1}\) is equal to \(x_\mathrm{R}\), the result is equal to \(x_{t-1}\).

Subcase (d) If any of the conditions \(x_{t-1}=x_{t+1}\ne x_\mathrm{R}\), \(x_{t-1} \ne x_{t+1} \ne x_\mathrm{R}\), or \(x_{t-1}=x_{t+1}=x_\mathrm{R}\) arises, the result is equal to \(x_t\), taking −1 and −2 as 0, and 2 and 3 as 1.

The mathematical formulation of Case 2 is:

$$\begin{aligned} x_{tk}^\mathrm{new}={\left\{ \begin{array}{ll} x_{t-1} ;&{} \mathrm{if} ~x_{t-1} \ne x_{t+1}=x_\mathrm{R}\\ x_{t} ;&{} \mathrm{Otherwise}. \end{array}\right. } \end{aligned}$$
(11)
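A minimal per-component sketch of this binary junior update (an illustrative reading of Eqs. (10)-(11); names are hypothetical) is:

```python
def binary_junior_update(x_prev, x_next, x_rand, x_t, rand_is_better):
    """Binary junior update of one component, following Eqs. (10)-(11).

    x_prev, x_next: components of the nearest best/worst individuals.
    x_rand, x_t:    components of the random and current individuals.
    rand_is_better: True when f(x_R) < f(x_t), i.e., Case 1 applies.
    """
    if rand_is_better:  # Case 1, Eq. (10)
        return x_rand if x_prev == x_next else x_prev
    # Case 2, Eq. (11): keep x_t unless x_{t-1} != x_{t+1} = x_R
    return x_prev if (x_prev != x_next and x_next == x_rand) else x_t
```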

Binary senior gaining and sharing stage

The working mechanism of the binary senior gaining and sharing stage is the same as that of the binary junior stage, with \(k_\mathrm{f}=1\). In the original senior gaining and sharing stage, the individuals are updated using the pseudocode in Fig. 2, which contains two cases. The two cases are modified for the binary senior gaining and sharing stage in the following manner:

Case 1 When \(f\left( x_{\mathrm{middle}}\right) <f(x_t)\): it contains three different vectors \((x_\mathrm{pbest},x_\mathrm{middle},x_\mathrm{pworst})\), which can assume only binary values (0 and 1); thus, a total of eight combinations are possible for updating the individuals. These eight combinations can be classified into two subcases [(a) and (b)], each containing four combinations. Table 3 presents the obtained results for this case.

Subcase (a) If \(x_\mathrm{pbest}\) is equal to \(x_\mathrm{pworst}\), the result is equal to \(x_\mathrm{middle}\).

Subcase (b) On the other hand, if \(x_\mathrm{pbest}\) is not equal to \(x_\mathrm{pworst}\), the result is equal to \(x_\mathrm{pbest}\), taking −1 and 2 to their nearest binary values (0 and 1, respectively).

Case 1 can be mathematically formulated in the following way:

$$\begin{aligned} x_{tk}^\mathrm{new}={\left\{ \begin{array}{ll} x_\mathrm{middle} ;&{} \mathrm{if} ~x_\mathrm{pbest}=x_\mathrm{pworst}\\ x_\mathrm{pbest} ;&{} \mathrm{if}~ x_\mathrm{pbest} \ne x_\mathrm{pworst}. \end{array}\right. } \end{aligned}$$
(12)

Case 2 When \(f(x_\mathrm{middle})\ge f(x_t)\): it consists of four different binary vectors \(\left( x_\mathrm{pbest},x_\mathrm{middle},x_\mathrm{pworst},x_{t}\right) \), whose values give a total of 16 combinations. The 16 combinations are also divided into two subcases [(c) and (d)], which contain 4 and 12 combinations, respectively. The subcases are explained in detail in Table 4.

Subcase (c) When \(x_\mathrm{pbest}\) is not equal to \(x_\mathrm{pworst}\), and \(x_\mathrm{pworst}\) is equal to \(x_\mathrm{middle}\), the obtained result is equal to \(x_\mathrm{pbest}\).

Subcase (d) In any case other than (c), the obtained result is equal to \(x_{t}\), taking −2 and −1 as 0, and 2 and 3 as 1.

The mathematical formulation of Case 2 is given as:

$$\begin{aligned} x_{tk}^\mathrm{new}={\left\{ \begin{array}{ll} x_\mathrm{pbest} ;&{} \mathrm{if} ~x_\mathrm{pbest} \ne x_\mathrm{pworst}=x_\mathrm{middle}\\ x_{t} ;&{} \mathrm{Otherwise}. \end{array}\right. } \end{aligned}$$
(13)
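The senior update admits the same per-component sketch (an illustrative reading of Eqs. (12)-(13); names hypothetical):

```python
def binary_senior_update(x_pbest, x_middle, x_pworst, x_t, middle_is_better):
    """Binary senior update of one component, following Eqs. (12)-(13).

    middle_is_better: True when f(x_middle) < f(x_t), i.e., Case 1 applies.
    """
    if middle_is_better:  # Case 1, Eq. (12)
        return x_middle if x_pbest == x_pworst else x_pbest
    # Case 2, Eq. (13): keep x_t unless x_pbest != x_pworst = x_middle
    return x_pbest if (x_pbest != x_pworst and x_pworst == x_middle) else x_t
```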

The pseudocode for NBGSK is shown in Fig. 4.

Population reduction on NBGSK (PR-NBGSK)

As the population size is one of the most important parameters of an optimization algorithm, it need not be fixed throughout the optimization process. For exploration of the search space, the population size should initially be large; however, to refine the quality of solutions and enhance the performance of the algorithm, a subsequent decrease in the population size is required.

Mohamed et al. [29] used a non-linear population reduction formula for the differential evolution algorithm to solve global numerical optimization problems. Based on that formula, the following framework is used here to reduce the population size gradually:

$$\begin{aligned} \mathrm{NP}_{G+1}=\mathrm{round}\left[ \mathrm{NP}_\mathrm{max}-\left( \frac{\mathrm{NP}_\mathrm{max}-\mathrm{NP}_\mathrm{min}}{\mathrm{MaxNFE}}\right) \times \mathrm{NFE}\right] \end{aligned}$$
(14)

where \(\mathrm{NP}_{G+1}\) denotes the modified (new) population size in the next generation, \(\mathrm{NP}_\mathrm{min}\) and \(\mathrm{NP}_\mathrm{max}\) are the minimum and maximum population sizes, respectively, NFE is the current number of function evaluations, and MaxNFE is the assumed maximum number of function evaluations. \(\mathrm{NP}_\mathrm{min}\) is assumed to be 12, since at least two elements are needed in each of the best and worst partitions. The main advantage of applying the population reduction technique to NBGSK is that infeasible or worst solutions are discarded from the initial phase of the optimization process without affecting the exploration capability. In the later stages, it emphasizes the exploitation tendency by deleting the worst solutions from the search space.
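A minimal sketch of this schedule (assuming the linear form reconstructed in Eq. (14)) is:

```python
def next_population_size(np_max, np_min, nfe, max_nfe):
    """Eq. (14): shrink NP linearly from np_max to np_min as NFE is consumed."""
    return round(np_max - (np_max - np_min) * nfe / max_nfe)

# Example: np_max = 100, np_min = 12, halfway through the budget -> NP = 56.
print(next_population_size(100, 12, 5000, 10000))
```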

Note: in this study, the population size reduction technique is combined with the proposed NBGSK; the resulting algorithm is named PR-NBGSK, and its pseudocode is shown in Fig. 5.

Table 5 Numerical values in PR-NBGSK and NBGSK
Table 6 Data for small-scale problems \(F_1{-}F_{10}\)

Numerical experiments and comparisons

To investigate the performance of the proposed PR-NBGSK and NBGSK algorithms, 0-1KP instances are considered. The first set consists of 10 small-scale problems taken from the literature [32], and the second is composed of 10 large-scale problems.

To solve the constrained optimization problem, different types of constraint handling techniques can be used [10, 26]. Deb introduced an efficient constraint handling technique based on feasibility rules [13]. The most commonly used approach to handle constraints is the penalty function method, in which infeasible solutions are punished with a penalty for violating the constraints. Bahreininejad [4] introduced the augmented Lagrange method (ALM) for the water cycle algorithm and solved real-world problems. In ALM, a constrained optimization problem is converted into an unconstrained one by adding penalty terms to the original objective function. The original optimization problem is transformed into the following unconstrained optimization problem:

$$\begin{aligned} \max = f(X)+\delta \sum _{t=1}^{m} \left\{ g_{t}(X)\right\} ^2- \lambda \sum _{t=1}^{m} \left\{ g_{t}(X)\right\} , \end{aligned}$$
(15)
Table 7 Results of small-scale 0-1KP
Table 8 Average computational time taken by all optimizers for small-scale problems

where f(X) is the original objective function of the problem, \(\delta \) is the quadratic penalty parameter, \(\sum _{t=1}^{m} \{g_{t}(X)\}^2\) represents the quadratic penalty term, and \(\lambda \) is the Lagrange multiplier.

The ALM is similar to the penalty method, in which the penalty parameter is chosen as large as possible. In ALM, \(\delta \) and \(\lambda \) are chosen in such a way that the parameters can remain small, maintaining distance from ill-conditioning. The advantage of ALM is that it decreases the chance of the ill-conditioning that occurs in the penalty approach.
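For the 0-1KP, the single constraint is \(g(X)=\sum _{k} w_{k}x_{k}-w_\mathrm{max}\le 0\). A minimal sketch of a fitness function in the spirit of Eq. (15), under the assumption that only constraint violations (\(g>0\)) are penalized, is:

```python
def penalized_profit(x, profits, weights, w_max, delta=1e3, lam=1.0):
    """Penalized 0-1KP fitness in the spirit of Eq. (15), to be maximized.

    Assumption: the penalty acts only on violations, i.e., on max(0, g(x)).
    delta and lam play the roles of the quadratic penalty parameter and the
    Lagrange multiplier; the values used here are placeholders.
    """
    profit = sum(p * xi for p, xi in zip(profits, x))
    g = max(0.0, sum(w * xi for w, xi in zip(weights, x)) - w_max)
    return profit - delta * g**2 - lam * g
```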

After applying the ALM to the constrained optimization problems, the problems are solved and compared with the binary bat algorithm [28], PSO with a V-shaped transfer function (VPSO) [27], PSO with an S-shaped transfer function (SPSO) [27], and probability binary PSO (BPSO) [35]. The algorithms are run on a personal computer with an Intel Core\(^\mathrm{TM}\) i5 @ 2.50 GHz and 4 GB RAM in MATLAB R2015a. The parameter values used in NBGSK and PR-NBGSK are given in Table 5.

Small-scale problems

This section contains low-dimensional 0-1KP instances; the details of every problem are presented in Table 6, in which the first two columns give the name and the dimension of each problem, respectively. The profits \(p_{k}\), weights \(w_{k}\), and knapsack capacity \(w_\mathrm{max}\) are given in the third column of Table 6.

These problems \(F_{1}{-}F_{10}\) are taken from the literature, where they were solved using different algorithms to obtain the optimal solution. The problems \(F_{1}\) and \(F_{2}\) were solved by the novel global harmony search algorithm [42], and the obtained optimal objective values are 295 and 1024, respectively.

A sequential combination tree algorithm was proposed by An and Fu [8] to solve the knapsack problem \(F_{3}\); the obtained optimal solution of \(F_{3}\) is 35 at (1,1,0,1). This method is applicable only to low-dimensional problems.

The knapsack problem \(F_{4}\) was solved using a greedy-policy-based algorithm [38], and the optimal objective value is 23 at (0,1,0,1).

To solve the knapsack problem \(F_{5}\) with 15 decision variables, Yoshizawa and Hashimoto [37] exploited information about the search-space landscape and found the optimal objective value 481.0694.

A method developed by Fayard and Plateau [14] was applied to \(F_{6}\), obtaining the optimal solution 50 at (0,0,1,0,1,1,1,1,0,0).

Fig. 6: Box plot for NFE used in ten problems of PR-NBGSK

Fig. 7: The convergence graph for small-scale 0-1KP

Table 9 Data for large-scale 0-1KP

The knapsack problem \(F_{7}\) was solved using a non-linear dimensionality reduction method by Zhao [20], yielding the optimal solution 107, and \(F_{8}\) was solved by NGHS, yielding the optimal solution 9767 [42].

The optimal solution of \(F_{9}\) found by the DNA algorithm [41] is 130 at (1,1,1,1,0).

The problem \(F_{10}\) is also taken from the literature; it was solved by NGHS [42], and the optimal solution found is 1025.

The solutions of the above ten problems are obtained by the PR-NBGSK and NBGSK algorithms and, for comparison, the problems are also solved by four state-of-the-art algorithms: BBA, VPSO, SPSO, and BPSO.

Each algorithm performs 50 independent runs, and the obtained results are presented in Table 7 with the best, worst, and average objective values, the number of function evaluations, and the success rate of each algorithm.

The comparison is conducted on the maximum number of function evaluations (NFE) used by each algorithm and the success rate (SR) of finding the optimal solution over 50 runs. From Table 7, it can be seen that both NBGSK and PR-NBGSK provide exact solutions for every problem \((F_{1}{-}F_{10})\). The SR of PR-NBGSK is 100% for every problem, whereas SPSO, BPSO, and BBA have an SR below 10% on some problems. Moreover, PR-NBGSK uses a very small number of function evaluations (shown in bold): on 6 out of 10 problems \((F_{1},~F_{3},~F_{4},~F_{6},~F_{7},~F_{9})\), PR-NBGSK uses fewer than 1000 function evaluations, whereas the other algorithms use 10,000 NFE on most problems. Table 8 shows the average computational time taken by all algorithms; PR-NBGSK takes the least computational time among the compared algorithms, being fastest on 7 out of 10 problems. Figure 6 shows the box plot of the NFE used by PR-NBGSK in solving the 10 knapsack problems, which indicates that, over 50 runs, PR-NBGSK finds the optimal solution without large oscillations in NFE. Figure 7 presents the convergence graphs of all algorithms for each problem, showing that PR-NBGSK converges to the optimal solution in fewer NFE than the other algorithms. Therefore, PR-NBGSK and NBGSK converge to the optimal solution faster than the other state-of-the-art algorithms.

Large-scale problems

In the previous subsection, only low-dimensional 0-1KP instances, which are comparatively easy to solve, were considered. Therefore, this part covers large-scale 0-1KP with randomly generated data. The data for the 10 knapsack problems are generated randomly with the following settings [36]: profit \(p_{k}\) is between 50 and 100; weight \(w_{k}\) is a random integer between 5 and 20. The capacities and dimensions of the problems, together with the maximum number of function evaluations, are displayed in Table 9.
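A minimal sketch of this data generation (assuming integer profits, since the source states only the ranges) is:

```python
import numpy as np

def make_instance(d, w_max, seed=0):
    """Random large-scale 0-1KP data: p_k in [50, 100], w_k in [5, 20]."""
    rng = np.random.default_rng(seed)
    profits = rng.integers(50, 101, size=d)  # assumed integer-valued profits
    weights = rng.integers(5, 21, size=d)
    return profits, weights, w_max
```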

Table 10 Results of large-scale 0-1KP

As the dimension of the problems increases, the problems become more complex. The problems \(F_{11}{-}F_{20}\) are solved using PR-NBGSK, NBGSK, BBA, VPSO, SPSO, and BPSO, with each algorithm performing 30 independent runs. The obtained solutions for every problem are given in Table 10 with the best, worst, and average objective values and their standard deviations. From Table 10, it can be observed that PR-NBGSK achieves overwhelming performance over the other algorithms and attains the best objective value (bold text) on all problems. Besides, the results provided by NBGSK are better than those of all compared algorithms on all problems. The BBA algorithm presents the worst results among all algorithms, with high standard deviations, and it can be concluded that BBA is not suitable for these high-dimensional knapsack problems.

The box plots for all algorithms are displayed in Fig. 8 and demonstrate that the best, worst, and mean solutions obtained by PR-NBGSK are much better than those of the other compared algorithms. They also show that there is little disparity among the objective values across runs. It can be seen from Table 10 that the standard deviations of both PR-NBGSK and NBGSK are much smaller than those of the other compared algorithms, while the other algorithms show more disparity among their objective values. The smallest standard deviation is provided by PR-NBGSK, which proves the robustness of the algorithm. Moreover, the average computational time taken by all algorithms has been calculated for all problems. Table 11 shows that PR-NBGSK takes very little time to solve the large-scale problems, whereas BBA consumes considerably more time than the other algorithms. VPSO and BPSO present good computational times; however, PR-NBGSK performs better on most of the problems. The convergence graphs of all algorithms are drawn in Fig. 9 to illustrate their performance. From the figures, it can be noticed that both PR-NBGSK and NBGSK converge to the best solution on all problems. Although the other state-of-the-art algorithms converge faster than PR-NBGSK and NBGSK, they either converge prematurely or stagnate at an early stage of the optimization process. Thus, it can be concluded that both PR-NBGSK and NBGSK are able to balance the two contradictory aspects: exploration capability and exploitation tendency.

Statistical analysis

To investigate the solution quality and the performance of the algorithms statistically [19], two non-parametric statistical hypothesis tests are conducted: the Friedman test and the multi-problem Wilcoxon signed-rank test.

In the Friedman test, final rankings are obtained for the different algorithms over all problems. The null hypothesis states that there is no significant difference among the performances of the algorithms, whereas the alternative hypothesis states that there is a significant difference. The decision is based on the obtained p value: when the p value is less than or equal to the assumed significance level of 0.05, the null hypothesis is rejected.

The multi-problem Wilcoxon signed-rank test is used to check the differences between the algorithms over all problems. Here, \(S^{+}\) denotes the sum of ranks over the problems on which the first algorithm performs better than the second, and \(S^{-}\) indicates the opposite. Larger rank sums indicate a larger performance discrepancy. The null hypothesis of this test states that there is no significant difference between the mean results of the two samples, and the alternative hypothesis states that there is a significant difference.
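As an aside, both tests are available in standard statistics libraries; the paper performs them in SPSS 20.00, so the following SciPy sketch with hypothetical per-problem values is only an equivalent illustration:

```python
from scipy.stats import friedmanchisquare, wilcoxon

# Hypothetical best objective values per problem (one list per algorithm).
pr_nbgsk = [295, 1024, 35, 23, 481]
nbgsk    = [295, 1024, 35, 23, 481]
bba      = [293, 1018, 34, 22, 475]

stat, p = friedmanchisquare(pr_nbgsk, nbgsk, bba)
print(f"Friedman p value: {p:.4f}")  # reject H0 if p <= 0.05

stat, p = wilcoxon(pr_nbgsk, bba)    # paired, multi-problem comparison
print(f"Wilcoxon p value: {p:.4f}")
```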

Fig. 8: Box plot for objective function value of large-scale 0-1KP

Table 11 Average computational time taken by all optimizers for large-scale problems

The three signs \(+,~-,~\approx \) are used to compare the performance of two algorithms and are described as follows:

Plus \((+)~:\) the results of the first algorithm are significantly better than those of the second one.

Minus \((-)~:\) the results of the first algorithm are significantly worse than those of the second one.

Approximate \((\approx )~:\) there is no significant difference between the two algorithms.

The p value is used to decide on the null hypothesis: the null hypothesis is rejected if the obtained p value is less than or equal to the assumed significance level (5%).

In the following results, the significant p values are shown in bold, and the tests are performed in SPSS 20.00. Table 12 lists the ranks according to the Friedman test. The p value computed through the Friedman test is less than 0.05; thus, we can conclude that there is a significant difference between the performances of the algorithms. The best rank is shared by the PR-NBGSK, SLC, ABHS, and NGHS algorithms, followed by NBGSK.

Table 13 summarizes the statistical results of applying the multiple-problem Wilcoxon test between PR-NBGSK and the other compared algorithms for problems \(F_{1}{-}F_{10}\). From Table 13, we can see that PR-NBGSK obtains higher \(S^+\) values than \(S^{-}\) in all cases, with the exception of SLC, ABHS, and NGHS, where \(S^+\) and \(S^-\) are both zero. Precisely, PR-NBGSK significantly outperforms SPSO, BHS, and BBA on all functions. Thus, according to the Wilcoxon test at \(\alpha =0.05\), a significant difference can be observed in 3 cases out of 9, which means that PR-NBGSK is significantly better than 3 of the 9 algorithms on the 10 test functions at \(\alpha =0.05\). Alternatively, to be more precise, it is evident from Table 13 that PR-NBGSK is inferior to, equal to, and superior to the other algorithms in 0, 63, and 27 of the total 90 cases, respectively. Thus, it can be concluded that the performance of PR-NBGSK is better than that of the compared algorithms in 30% of all cases and the same as that of the compared algorithms in the remaining 70%.

Table 14 lists the ranks according to the Friedman test. The p value computed through the Friedman test is less than 0.05; thus, we can conclude that there is a significant difference between the performances of the algorithms. The best rank is for PR-NBGSK, followed by NBGSK.

Fig. 9: The convergence graph for large-scale 0-1KP

Table 12 Results of Friedman test for all algorithms across \(F_{1}{-}F_{10}\) problems
Table 13 Wilcoxon test against PR-NBGSK for \(F_{1}{-}F_{10}\)
Table 14 Results of Friedman test for all algorithms across \(F_{11}{-}F_{20}\) problems
Table 15 Wilcoxon test against PR-NBGSK for \(F_{11}{-}F_{20}\)

Table 15 summarizes the statistical results of applying the multiple-problem Wilcoxon test between PR-NBGSK and the other compared algorithms for problems \(F_{11}{-}F_{20}\). From Table 15, we can see that PR-NBGSK obtains higher \(S^+\) values than \(S^-\) in all cases. Precisely, PR-NBGSK significantly outperforms all algorithms on all problems. Thus, according to the Wilcoxon test at \(\alpha =0.05\), a significant difference can be observed in all five cases, which means that PR-NBGSK is significantly better than the five algorithms on the ten test problems at \(\alpha =0.05\). Alternatively, it is evident from Table 15 that PR-NBGSK is inferior to, equal to, and superior to the other algorithms in 0, 0, and 50 of the total 50 cases, respectively. Thus, the performance of PR-NBGSK is better than that of the compared algorithms in 100% of all cases. Accordingly, it can be deduced from these comparisons that the superiority of the PR-NBGSK algorithm over the compared algorithms increases as the dimension of the problems increases.

From the above discussion and results, it can be concluded that the proposed PR-NBGSK algorithm has better search quality, efficiency, and robustness for solving low- and high-dimensional knapsack problems. The PR-NBGSK algorithm shows overwhelming performance on all problems and proves its superiority over the state-of-the-art algorithms. Moreover, the proposed binary junior and senior stages keep the balance between the two main components of the algorithm, the exploration and exploitation abilities, and the population reduction rule helps to delete the worst solutions from the search space of PR-NBGSK. Besides, PR-NBGSK is very simple and easy to understand and to implement in many languages.

Conclusions

This article presents a significant step and a promising approach to solving complex optimization problems in binary space. A novel binary version of the gaining sharing knowledge-based optimization algorithm (NBGSK) is proposed to solve binary combinatorial optimization problems. NBGSK uses two vital binary stages: the binary junior gaining and sharing stage and the binary senior gaining and sharing stage, which are derived from the original junior and senior stages, respectively. Moreover, to enhance the performance of NBGSK and to discard the worst and infeasible solutions, a population size reduction technique is applied to NBGSK, and a new variant of NBGSK, i.e., PR-NBGSK, is introduced. The proposed algorithms are employed on a large number of instances of 0-1 knapsack problems. The obtained results demonstrate that PR-NBGSK and NBGSK perform better than or equal to the state-of-the-art algorithms on low-dimensional 0-1 knapsack problems. On high-dimensional problems, PR-NBGSK outperforms the other mentioned algorithms, which is also proven by the statistical analysis of the solutions. Finally, the convergence graphs and the presented box plots show that PR-NBGSK is superior to the other competitive algorithms in terms of convergence, robustness, and ability to find the optimal solutions of 0-1 knapsack problems.

Additionally, in future research, the NBGSK and PR-NBGSK algorithms can be applied to multi-dimensional knapsack problems, and they may be enhanced by incorporating a novel adaptive scheme for solving real-world problems. The MATLAB source code of PR-NBGSK can be downloaded from https://sites.google.com/view/optimization-project/files.