Introduction

Optimization problems arise widely in the real world, and methods for solving them have therefore long been an active research topic. Traditional gradient-based methods (such as the steepest descent method and Newton's method) have been used successfully to solve various optimization problems. However, as optimization problems become increasingly complicated, gradient-based methods grow inefficient and inconvenient because they require substantial gradient information, are sensitive to initial values, and need a large amount of memory. In the past few decades, evolutionary computing has become an attractive and effective approach to the rapidly growing set of complex modern optimization problems. Optimization approaches inspired by biological systems have attracted considerable interest in recent years and have been quite successful in solving problems in the optimal allocation of science and technology resources, industrial automation, the economy and other fields.

As is well known, human beings, the smartest creatures on Earth, are capable of solving a large number of complicated problems that other living beings, such as birds, ants, and fireflies, cannot tackle. Humans have a powerful learning ability, and the process of human learning is extremely complicated; its study belongs to neuropsychology, educational psychology, learning theory and pedagogy [1]. In fact, most human learning activities resemble the search process of metaheuristics. Motivated by this observation, Wang et al. [2] proposed human learning optimization (HLO) based on a simplified human learning model in which three learning operators, i.e. the random learning operator (RLO), the individual learning operator (ILO), and the social learning operator (SLO), search for the optimal solution by mimicking the random, individual, and social learning strategies in human learning activities, respectively. The performance of HLO, like that of other metaheuristic algorithms, is sensitive to its parameter values. Therefore, to further improve the global search ability and robustness of HLO, an adaptive simplified human learning optimization (ASHLO) [3] was proposed. Later, based on the fact that Intelligence Quotient (IQ) scores follow a Gaussian distribution [4], a diverse human learning optimization algorithm [5] was proposed to improve the performance of HLO, in which the Gaussian distribution and a dynamic adjustment strategy were introduced to strengthen the robustness of the algorithm. In addition, a new adaptive HLO based on sine-cosine functions [6] was developed, which periodically enhances the exploration and exploitation abilities of the algorithm by tuning the control parameters with the sine and cosine functions.
Recently, an improved adaptive human learning optimization algorithm (IAHLO) [7] was developed to dynamically tune the control parameter of the random learning operator, which can efficiently develop diversity at the beginning of the iteration and perform an accurate local search at the end of the search.

Since both the individual learning and social learning operations of standard HLO copy their individual optima, and only random learning explores new solutions with a small probability, the algorithm is prone to falling into local optima. Therefore, it is necessary to design new operators to further improve the performance of HLO. For this reason, a relearning operation was designed in [8] for the case in which an individual cannot continuously improve its fitness. This operator erases the stored knowledge of the individual and makes the individual start searching again, giving it a good chance to escape from local optima. Besides, the hybrid algorithms ASHLO-GA [9] and HLO-PSO [10] were proposed to tackle the supply chain network design problem and the flexible job-shop scheduling problem (FJSP), respectively. Since binary algorithms are ineffective in solving high-dimensional continuous problems, a continuous HLO was first presented and integrated with the binary HLO to solve mixed-variable optimization problems [11]. In addition, a discrete HLO was proposed in [12] and successfully used to solve the production scheduling problem. By now, HLO has been successfully applied to knapsack problems [2, 3, 5, 8], text extraction [13], optimal power flow calculation [14, 15], financial market forecasting [6], image segmentation [16] and intelligent control [17,18,19].

Nowadays, human learning is widely researched in multiple disciplines including computer science [20], economics [21], sociology [22], etc. Human beings have social attributes and constantly interact with each other in everyday life. Competition and cooperation are the two basic forms of interaction [23]. In fact, many human social activities contain both competition and cooperation [24]. Furthermore, competition can lead to higher levels of cooperation [25], and cooperation can in turn build competitive advantage [26]. Therefore, competition and cooperation can motivate learners to learn more efficiently and improve their efficiency in solving problems. In real life, people find and learn from individuals who are better than themselves by comparing with each other; the comparison can be seen as a process of competition, while learning from each other is cooperation. Through competition and cooperation, individuals have more opportunities to learn new knowledge, and the overall structure of the group becomes more diversified [27]. The social learning operation of HLO is designed according to the "copy-the-best" strategy, which can easily lead the algorithm into local optima. The introduction of a cooperation mechanism can help the algorithm avoid such entrapment, because the learning mechanism of competition and cooperation effectively increases the diversity of the population. Inspired by this learning mechanism, a human learning optimization with competitive and cooperative learning (HLOCC) is proposed in this paper, in which a novel competitive and cooperative learning strategy is developed to improve the balance between exploration and exploitation of HLO.

The rest of the paper is organized as follows. Section "Human learning optimization with competitive and cooperative learning" introduces the idea, operators and implementation of the proposed HLOCC in detail. A parameter study of HLOCC is given in section "Parameter study of HLOCC" to analyze insightfully and explain why the developed competitive and cooperative learning operator can enhance the search ability of the algorithm. Then the performance of HLOCC is evaluated and compared with recent state-of-the-art metaheuristics in section "Experimental results and discussions". Finally, conclusions are drawn in section "Conclusions and future works".

Human learning optimization with competitive and cooperative learning

As a binary metaheuristic, HLOCC adopts the binary-coding framework, and therefore each individual, i.e. each solution, is composed of a binary string as in Eq. (1), in which each bit denotes a basic component of the knowledge of the problem,

$$ \begin{aligned} & x_{i} = \left[ {x_{i1} \, x_{i2} \cdots x_{ij} \cdots x_{iM} } \right],\\ & x_{ij} \in \left\{ {0,1} \right\},{1} \le i \le N,{1} \le j \le M \end{aligned} $$
(1)

where \({x}_{ij}\) is the j-th bit of the i-th individual, and N and M denote the size of the population and the length of solutions, respectively. At the beginning of learning, humans usually have no prior knowledge of problems; thus each bit of each individual in HLOCC is initialized to "0" or "1" randomly.
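This random initialization can be sketched as follows (a minimal illustrative sketch; the function name and the use of NumPy are our own, not part of the original HLO implementation):

```python
import numpy as np

def init_population(N, M, seed=None):
    """Initialize N binary individuals of length M as in Eq. (1):
    each bit is set to 0 or 1 with equal probability."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(N, M))
```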

Random learning operator

Random learning is always present in human learning, as usually there is no prior knowledge of new problems [28]. Besides, it is a simple but effective strategy for humans to explore new strategies and improve performance in the process of learning. To imitate the random learning strategy, the random learning operator (RLO) is used in HLOCC as Eq. (2)

$$ x_{ij} = {\text{RLO}} = \left\{ {\begin{array}{*{20}l} {0,{ 0} \le r_{1} \le 0.5} \hfill \\ {1,{\text{ else}}} \hfill \\ \end{array} } \right. $$
(2)

where \({r}_{1}\) is a random number uniformly distributed between 0 and 1.
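A minimal sketch of the RLO in Python (the helper name is ours; any uniform random source can be substituted):

```python
import random

def rlo(rng=random):
    """Eq. (2): emit 0 if r1 <= 0.5, otherwise 1."""
    r1 = rng.random()
    return 0 if r1 <= 0.5 else 1
```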

Individual learning operator

Individual learning [29, 30] is the ability of humans to build up knowledge through individual reflection. By drawing on previous experience, people can avoid mistakes and improve the efficiency and effectiveness of learning. To mimic this learning behavior, the L best solutions of each individual are memorized and stored in the individual knowledge database (IKD) of HLOCC for individual learning, which is defined as Eqs. (3) and (4):

$$ I{\text{KD}} = \left[ {\begin{array}{*{20}c} {{\text{ikd}}_{1} } \\ {{\text{ikd}}_{2} } \\ \vdots \\ {{\text{ikd}}_{i} } \\ \vdots \\ {{\text{ikd}}_{N} } \\ \end{array} } \right]{, 1} \le i \le N $$
(3)
$$ {\text{ikd}}_{i} = \left[ {\begin{array}{*{20}c} {{\text{ikd}}_{i1} } \\ {{\text{ikd}}_{i2} } \\ \vdots \\ {{\text{ikd}}_{ip} } \\ \vdots \\ {{\text{ikd}}_{iL} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\text{ik}}_{i1,1} } & {{\text{ik}}_{i1,2} } & \cdots & {{\text{ik}}_{i1,j} } & \cdots & {{\text{ik}}_{i1,M} } \\ {{\text{ik}}_{i2,1} } & {{\text{ik}}_{i2,2} } & \cdots & {{\text{ik}}_{i2,j} } & \cdots & {{\text{ik}}_{i2,M} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {{\text{ik}}_{ip,1} } & {{\text{ik}}_{ip,2} } & \cdots & {{\text{ik}}_{ip,j} } & \cdots & {{\text{ik}}_{ip,M} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {{\text{ik}}_{iL,1} } & {{\text{ik}}_{iL,2} } & \cdots & {{\text{ik}}_{iL,j} } & \cdots & {{\text{ik}}_{iL,M} } \\ \end{array} } \right],\;1 \le p \le L $$
(4)

where \({\mathrm{ikd}}_{i}\) denotes the individual knowledge database of person i, and \({\mathrm{ikd}}_{ip}\) stands for the p-th best solution of person i.

When HLOCC performs the individual learning operator (ILO), the candidate solution learns from a random solution in its IKD as Eq. (5):

$$ x_{ij} = {\text{ik}}_{ip,j} $$
(5)

After a new population is generated, the fitness of all individuals is calculated according to the pre-defined fitness function to update the IKDs. Since HLOCC is designed for solving single-objective problems, the size of IKDs is set to 1 as suggested in previous works on HLO. Therefore, the new candidate replaces the original solution in the IKDs if and only if its fitness value is superior.
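With the IKD size set to 1, the greedy update reduces to a single comparison, e.g. (an illustrative helper, assuming minimization; names are ours):

```python
import numpy as np

def update_ikd(ikd, ikd_fitness, candidate, cand_fitness):
    """Greedy IKD update with L = 1: the candidate replaces the stored
    solution if and only if its fitness is strictly better
    (minimization assumed)."""
    if cand_fitness < ikd_fitness:
        return candidate.copy(), cand_fitness
    return ikd, ikd_fitness
```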

Competitive and cooperative learning operator

Competition and cooperation, as two basic modes of social cognition [31], occupy a critically important place in research on individual, group and societal behavior [32]. Social interdependence theory [33, 34] states that competitive and cooperative learning can significantly improve learning efficiency and make the overall structure of the group more diversified. Because the social learning of standard HLO relies too heavily on the global optimum, the algorithm easily falls into local optima. A competition and cooperation mechanism can increase the diversity of the population, so that the algorithm can discover new solution areas. Inspired by these findings, a novel competitive and cooperative learning operator (CCLO) is developed to implement competitive and cooperative learning in HLOCC and increase the diversity of the population.

When performing the CCLO, the current individual is compared, according to fitness value, with an individual randomly selected from the population (excluding itself) to determine whether the current individual is a winner or a loser. The winner's individual best solution is retained, while the loser learns from the winner's experience. During the learning process, if the current individual is the winner, standard HLO continues to be followed; otherwise, the competitive and cooperative learning operator is applied so that the loser learns from the winner's individual best solution. The learning process can be represented by Eq. (6)

$$ x_{ij}^{{{\text{loser}}}} = ik_{kj}^{{{\text{winner}}}} ,k \ne i $$
(6)

where \(ik_{k}^{{{\text{winner}}}}\) denotes the IKD of winner individual k. The mechanism of competitive and cooperative learning is illustrated in Fig. 1.

Fig. 1
figure 1

Mechanisms of the competitive and cooperative learning
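The pairing step described above can be sketched as follows (a hypothetical helper, assuming minimization; the random opponent is drawn from the population excluding the current individual):

```python
import numpy as np

def compete(i, fitness, rng):
    """Pick a random opponent k != i and decide whether individual i
    is the winner (smaller-or-equal fitness wins, minimization assumed)."""
    N = len(fitness)
    k = int(rng.integers(0, N - 1))
    if k >= i:          # shift to skip index i itself
        k += 1
    return fitness[i] <= fitness[k], k
```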

Social learning operator

Social learning [35] plays an important role in the social environment because it allows human beings to copy the best solutions in the population and accelerate the learning process. Although the CCLO is also a social learning operation, the probability of learning from the best individual is relatively small and its learning efficiency is low. When problems become extremely complicated and time-consuming, people prefer to learn from the individual with the highest social evaluation value in the population, i.e. the "copy-the-best" strategy [36], which can quickly drive the whole population towards the best solutions. Correspondingly, HLOCC conducts the social learning operator to emulate this "copy-the-best" behavior of humans; the best knowledge of the population is stored in the social knowledge database (SKD) as Eq. (7)

$$ {\text{SKD}} = \left[ {\begin{array}{*{20}c} {{\text{skd}}_{1} } \\ {{\text{skd}}_{2} } \\ \vdots \\ {{\text{skd}}_{q} } \\ \vdots \\ {{\text{skd}}_{H} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\text{sk}}_{11} } & {{\text{sk}}_{12} } & \cdots & {{\text{sk}}_{1j} } & \cdots & {{\text{sk}}_{1M} } \\ {{\text{sk}}_{21} } & {{\text{sk}}_{22} } & \cdots & {{\text{sk}}_{2j} } & \cdots & {{\text{sk}}_{2M} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {{\text{sk}}_{q1} } & {{\text{sk}}_{q2} } & \cdots & {{\text{sk}}_{qj} } & \cdots & {{\text{sk}}_{qM} } \\ \vdots & \vdots & {} & \vdots & {} & \vdots \\ {{\text{sk}}_{H1} } & {{\text{sk}}_{H2} } & \cdots & {{\text{sk}}_{Hj} } & \cdots & {{\text{sk}}_{HM} } \\ \end{array} } \right],\;1 \le q \le H $$
(7)

where \({\mathrm{skd}}_{q}\) denotes the q-th solution in the SKD and H is the size of SKD.

With the knowledge in the SKD, the HLOCC performs the social learning operator (SLO) to generate new candidate solutions as Eq. (8)

$$ x_{ij} = {\text{skd}}_{qj} $$
(8)

The size of the SKD in HLOCC is also set to 1, and the new candidate replaces the current one in the SKD only if it has a better fitness value.

Implementation of HLOCC

In summary, when the current individual is the winner, it uses the three learning operators of standard HLO, as described by Eq. (9); if it is the loser, it performs the random learning operator, the individual learning operator, the competitive and cooperative learning operator and the social learning operator to generate new candidate solutions, as presented in Eq. (10)

$$ x_{ij}^{{{\text{winner}}}} = \left\{ {\begin{array}{*{20}l} {{\text{RLO}},} &\quad {0 \le r_{2} \le {\text{pr}}} \\ {{\text{ik}}_{{ipj}}^{{{\text{winner}}}} ,} &\quad {{\text{pr}} < r_{2} \le {\text{pi}}} \\ {{\text{sk}}_{qj} ,} &\quad {{\text{else}}} \\ \end{array} } \right. $$
(9)
$$ x_{{ij}}^{{{\text{loser}}}} = \left\{ {\begin{array}{lllll} {{\text{RLO}},} \hfill &\quad {0 \le r_{3} \le {\text{pr}}} \hfill \\ {{\text{ik}}_{{ipj}}^{{{\text{loser}}}} ,} \hfill &\quad {{\text{pr}} < r_{3} \le {\text{pil}}} \hfill \\ {{\text{ik}}_{{kj}}^{{{\text{winner}}}} ,} \hfill &\quad {{\text{pil}} < r_{3} \le {\text{pcc}}} \hfill \\ {{\text{sk}}_{{qj}} ,} \hfill &\quad {{\text{else}}} \hfill \\ \end{array} } \right. $$
(10)

where \(r_{2}\) and \(r_{3}\) are random numbers uniformly distributed between 0 and 1; pr, (pi–pr) and (1–pi) represent the probabilities of random learning, individual learning and social learning for a winner individual, respectively; and pr, (pil–pr), (pcc–pil) and (1–pcc) are the probabilities of random learning, individual learning, competitive and cooperative learning, and social learning for a loser individual, respectively.
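The per-bit selection logic of the two cases can be sketched as below (a schematic fragment; the arguments rlo_bit, own_ik, winner_ik and sk_bit stand for bit values produced by the RLO, the individual's own IKD, the winner's IKD and the SKD, respectively):

```python
def winner_bit(r2, pr, pi, rlo_bit, own_ik, sk_bit):
    """Winner case: follow standard HLO (random, individual,
    then social learning)."""
    if r2 <= pr:
        return rlo_bit
    if r2 <= pi:
        return own_ik
    return sk_bit

def loser_bit(r3, pr, pil, pcc, rlo_bit, own_ik, winner_ik, sk_bit):
    """Loser case: additionally learn from the winner's IKD
    with probability (pcc - pil)."""
    if r3 <= pr:
        return rlo_bit
    if r3 <= pil:
        return own_ik
    if r3 <= pcc:
        return winner_ik
    return sk_bit
```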

The implementation of HLOCC is described in Fig. 2.

Fig. 2
figure 2

The flowchart of HLOCC

Parameter study of HLOCC

Analysis of the control parameters

A parameter study was performed in this section to analyze and choose fair control parameter values for HLOCC. For simplicity, the control parameters pr and pi adopt the default values of HLO [2], i.e. 5/M and 0.85 + 2/M, since the learning strategy of the winner in HLOCC is the same as in standard HLO. Then pil and pcc are investigated, as together they determine the probability of performing the CCLO. The cross-combination method was used, and the candidate values of the control parameters pil and pcc are listed in Table 1. Two functions, i.e. F2 and F8 chosen from the CEC17 benchmark functions [37], were adopted to investigate the influence of these two control parameters; the characteristics of the CEC17 benchmark are given in Table 2. The size of the population and the maximum number of iterations were set to 50 and 3000 on the 10-dimensional functions, and increased to 100 and 6000 on the 30-dimensional functions. Each decision variable was encoded by 30 bits, and each function was run 100 times independently. To choose the optimal parameter combination, the mean value (Mean) was calculated as the performance indicator and is shown in Table 3, where the best numbers are in bold. Note that pcc must be bigger than pil in HLOCC, and therefore only the 71 cases that meet this requirement are listed in Table 3. The trials that rank in the top 10 percent on 10-D F2, 10-D F8, 30-D F2 and 30-D F8 are marked in italics. The parameter combination of trial 58 is adopted as the default parameter setting in this paper, as it is the only one in the top 10 percent in all four cases.

Table 1 Parameter values of pil and pcc
Table 2 CEC17 benchmark functions
Table 3 Results of parameter study

Table 3 shows that HLOCC obtains the best comprehensive results when pil and pcc are set to 0.88 and 0.96, which are chosen as the default values in this work as mentioned above. Obviously, the results in Table 3 show that the setting of pil and pcc plays an important role because these parameters determine the probabilities of individual learning, competitive and cooperative learning, and social learning. It is not difficult to see that the performance of HLOCC drops significantly when pil < 0.80, as a too-small pil would spoil the principal learning mechanisms of HLO. The probability of competitive and cooperative learning in HLOCC is determined by the value of (pcc–pil), and it can be noticed that the performance of HLOCC is best when (pcc–pil) = 0.06. In the whole population of HLOCC, some individuals perform the standard HLO operation, while others perform the CCLO operation. Compared with the way new solutions are generated in standard HLO, the added CCLO operation gives some individuals the opportunity to learn from other individuals besides their own optima and the global optima, thus increasing the diversity of the algorithm, while HLOCC retains other individuals performing standard HLO operations to ensure the convergence of the algorithm. Therefore, HLOCC can achieve a better trade-off between exploration and exploitation with the CCLO, which will be further discussed and demonstrated in the next sub-section.

Role of competitive and cooperative learning

To clearly understand the role of the developed competitive and cooperative learning operator, the proposed HLOCC is compared with a variant named HLOCC1, which performs only the CCLO, and with standard HLO, which has no CCLO. For a fair comparison, the parameters of standard HLO are set to the values recommended in [2], and the CCLO-related parameters pil and pcc of HLOCC are set to 0.88 and 0.96, respectively. HLOCC, HLO and HLOCC1 were all adopted to solve F2 and F8 from the CEC17 benchmark functions. Since these two functions are representative unimodal and multimodal functions, respectively, it is convenient to reveal the performance change caused by the CCLO in the search process of HLOCC by recording population diversity changes and comparing them with the other HLO variants.

The average fitness value (AFV) curves and the average distance (AD) curves of the three algorithms on the 10-dimensional and 30-dimensional F2 and F8 are drawn in Figs. 3, 4, 5 and 6. The average distance is defined as the Hamming distance between the global best solution and the individual best solutions as in Eq. (11), which can be used to examine the variation in the exploration and exploitation abilities of the algorithms under different strategies. Therefore, by comparing the AFV and AD curves of HLOCC, HLOCC1 and HLO, the impacts of the CCLO can be clearly probed.

$$ {\text{AD}} = \frac{{\sum\limits_{i = 1}^{N - 1} {\sum\limits_{j = 1}^{M} {\left| {ik_{ij} - sk_{j} } \right|} } }}{M \times (N - 1)},\;\;1 \le i \le N - 1,\;\;1 \le j \le M $$
(11)
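Eq. (11) can be computed directly from the stored IKDs and SKD, e.g. (a sketch assuming each IKD holds a single binary vector; the sum runs over N − 1 individuals exactly as in the equation):

```python
import numpy as np

def average_distance(ik, sk):
    """Eq. (11): normalized Hamming distance between the individual
    best solutions (rows of ik, shape (N, M)) and the global best
    solution sk (length M)."""
    N, M = ik.shape
    return np.abs(ik[: N - 1] - sk).sum() / (M * (N - 1))
```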
Fig. 3
figure 3

The curves of fitness value and average distance of 10-Dimension F2

Fig. 4
figure 4

The curves of fitness value and average distance of 10-Dimension F8

Fig. 5
figure 5

The curves of fitness value and average distance of 30-Dimension F2

Fig. 6
figure 6

The curves of fitness value and average distance of 30-Dimension F8

It can be clearly seen from the fitness curves in Figs. 3, 4, 5, 6 that HLOCC achieves the best solutions on all the cases. The influences of CCLO can be concluded as follows:

  1.

    By comparing HLOCC with HLO, it can be seen that the AD curve of HLO converges faster than that of HLOCC; therefore HLO may not sufficiently explore the solution space and is likely to be stuck in local optima. With the help of the CCLO, HLOCC can effectively maintain diversity and prevent the algorithm from falling into local optima.

  2.

    By comparing HLOCC and HLOCC1, it can be found that HLOCC1 has the opportunity to learn from other individuals, rather than simply copying the SKD like HLO, and thus its diversity is maintained well. However, as HLOCC1 performs only CCLO learning, all individuals have only a small probability of learning from the SKD, so its convergence rate is slow.

In summary, the developed CCLO can efficiently retain the optimal bit values by learning useful information from the winners' individual best solutions, and simultaneously maintain exploration ability by comparing two different randomly chosen individuals each time. Therefore, with the introduction of the CCLO, HLOCC achieves a much better trade-off between exploration and exploitation. Specifically, at the beginning of the search, the efficiency and reliability of the ILO are low due to the random initialization of the population. At this stage, the CCLO can efficiently find the optimal bit values by learning from winner individual best solutions, which boosts the effectiveness and confidence of the social learning operator and consequently enhances the exploitation ability of HLOCC. As the greedy strategy is adopted for updating the IKDs and the SKD, the risk of premature convergence and entrapment in local optima increases quickly as the search progresses. With the introduction of random individual competition, some individuals in the population retain initial information, so useful information lost during the learning process has the opportunity to be recovered by competitive and cooperative learning. The CCLO thus makes it less likely that information will be lost during the learning process, and therefore the global search ability is significantly enhanced.

Experimental results and discussions

To verify its performance, the proposed HLOCC, as well as eight recent algorithms, i.e. Simple Human Learning Optimization (SHLO) [2], Diverse Human Learning Optimization (DHLO) [8], Time-Varying Mirrored S-shaped Binary Particle Swarm Optimization (TVMSBPSO) [38], Binary Whale Optimization Algorithm (BWOA) [39], Binary Crow Search Algorithm (BinCSA) [40], Quadratic Binary Particle Swarm Optimization (QBPSO) [41], Improved Binary Differential Evolution (IBDE) [42] and Binary Gaining-sharing Knowledge-based Optimization (pBGSK) [43], was applied to solve the CEC17 benchmark functions. For a fair comparison, the recommended parameter values were adopted for all the algorithms, as listed in Table 4. All the cases were run 100 times independently. Then HLOCC was tested on fifteen small-, medium- and large-scale UFL problems, and its performance was compared with other state-of-the-art algorithms. In the third study, a different data set named M* was tackled by HLOCC to further analyze and discuss its performance.

Table 4 Parameters settings of the algorithms

CEC17 benchmark functions

Low-dimensional functions

HLOCC and the other eight state-of-the-art algorithms were used to solve the 10-dimensional CEC17 functions. The numerical results, including the Mean, the best value (Best) and the standard deviation (Std), are listed in Table 5, where the best results are marked in bold. Besides, Student's t test (t-test) and the Wilcoxon signed-rank test (W-test) were performed, and the corresponding results are also shown in Table 5, in which "+/§/#" indicate that the optimization result of HLOCC is significantly better than, similar to, or worse than that of the compared algorithm at the 95% confidence level, respectively. In Table 5, the superscript of the Mean represents the t-test result, and the subscript represents the W-test result. Note that the t-test, a parametric test, requires normality and homogeneity of variance, while the W-test, a nonparametric test, does not. Therefore, the t-test is more reliable when the Gaussian distribution assumption is met, while the W-test is more powerful when this assumption is violated. For convenience, the results of the t-test and W-test are summarized in Table 6, in which the total score is calculated by subtracting Worse from Better.

Table 5 Results of all algorithms on the 10-dimensional benchmark functions
Table 6 Summary results of the t-test and W-test on the 10-dimensional benchmark functions

Table 5 shows that the proposed HLOCC has the best mean numerical results on 25 out of 30 functions and is inferior only to TVMSBPSO on F3. Besides, the summary results of the t-test in Table 6 indicate that the proposed HLOCC surpasses HLO, DHLO, TVMSBPSO, BWOA, BinCSA, QBPSO, IBDE and pBGSK on 28, 11, 27, 30, 28, 30, 30 and 30 out of 30 functions, respectively. The W-test results confirm that HLOCC significantly outperforms these compared algorithms on 29, 10, 29, 30, 29, 30, 30 and 30 out of 30 functions, respectively. Thus, it is fair to claim that HLOCC achieves the best performance on the low-dimensional functions.

High dimensional benchmark functions

For the high-dimensional functions, the total number of candidate solutions is increased to 2900, which poses a challenge to all the algorithms. The results of all algorithms on the 30-dimensional benchmark functions are listed in Table 7, and the summarized t-test and W-test results are given in Table 8. Table 7 clearly shows that HLOCC has the best optimization performance, obtaining the best mean numerical results on 27 out of 30 functions. Besides, Table 8 reveals that the optimization ability of HLOCC is better than that of the other algorithms. Specifically, the t-test results show that HLOCC significantly outperforms HLO, DHLO, TVMSBPSO, BWOA, BinCSA, QBPSO, IBDE and pBGSK on 29, 24, 28, 30, 26, 28, 29 and 30 functions, while it is worse than them on only 0, 1, 1, 0, 0, 0, 0 and 0 functions, respectively. The W-test results show that HLOCC significantly outperforms HLO, DHLO, TVMSBPSO, BWOA, BinCSA, QBPSO, IBDE and pBGSK on 30, 25, 27, 30, 26, 28, 29 and 30 out of 30 functions, respectively, while it is defeated by them on only 0, 1, 1, 0, 0, 0, 0 and 0 functions, respectively. HLOCC maintains positive total scores and even obtains higher values than most compared algorithms, which shows its evident advantage.

Table 7 Results of all algorithms on the 30-dimensional benchmark functions
Table 8 Summary results of the t-test and W-test on the 30-dimensional benchmark functions

The uncapacitated facility location problem

The uncapacitated facility location problem (UFLP), also known as the simple plant location problem [44] or the uncapacitated warehouse location problem [45], is a well-known binary optimization problem in operations research. The problem has only binary decision variables and is therefore suitable for performance analysis and comparison of binary optimization algorithms. In its basic formulation, the UFL problem consists of m customers and n candidate facilities [46]. The problem involves two costs: the fixed cost of opening a facility at a potential location and the transportation cost between a customer location and a facility. The aim is to minimize the total opening and transportation costs by setting the decision variables to 0 or 1 so as to meet the demands of the customers when determining the locations of the facilities [47]; the value of each decision variable determines whether the corresponding facility is opened. The general model of the UFL problem can be stated mathematically as follows [48]:

$$ \min f = \sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{m} {c_{ij} } } x_{ij} + \sum\limits_{i = 1}^{n} {f_{i} } y_{i} $$
(12)

subject to:

$$ \begin{gathered} \sum\limits_{i = 1}^{n} {x_{ij} = 1,\;j = 1,2, \ldots ,m} \hfill \\ x_{ij} - y_{i} \le 0,\;i = 1,2, \ldots ,n,j = 1,2, \ldots ,m \hfill \\ x_{ij} \in \left\{ {0,1} \right\},\;i = 1,2, \ldots ,n,j = 1,2, \ldots ,m \hfill \\ y_{i} \in \left\{ {0,1} \right\},\;i = 1,2, \ldots ,n \hfill \\ \end{gathered} $$
(13)

where \(c_{ij}\) is the transportation cost between the location of facility i and the location of customer j; \(f_{i}\) represents the cost of opening a facility at location i; and \(x_{ij}\) and \(y_{i}\) are decision variables, each being either 0 or 1. If customer j demands service from the facility at location i, \(x_{ij}\) is set to 1; otherwise, it is set to 0. In the same way, when a facility is opened at location i, the value of \(y_{i}\) is 1; otherwise, it is 0.
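Because each customer is optimally served by its cheapest open facility, the objective of Eqs. (12)–(13) can be evaluated directly from the opening vector y, e.g. (a sketch; variable names are ours):

```python
import numpy as np

def ufl_cost(y, c, f):
    """Eq. (12): total cost for opening vector y.
    c[i][j] is the transport cost from facility i to customer j and
    f[i] the fixed opening cost; each customer is assigned to the
    cheapest open facility, which fixes the x_ij implicitly."""
    open_idx = np.flatnonzero(y)
    if open_idx.size == 0:          # infeasible: no facility is open
        return float("inf")
    transport = c[open_idx].min(axis=0).sum()
    return transport + f[open_idx].sum()
```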

In order to investigate and examine the effectiveness of the proposed algorithm, two benchmark data sets, i.e. the ORLIB library with 15 benchmark instances [49] and the M* data set taken from [50], are used to evaluate HLOCC. The optimum results of the two benchmark sets are shown in Tables 9 and 10, respectively. Six performance measures are used to validate the effectiveness of HLOCC: best, mean, worst, standard deviation, gap and hit. Hit is the number of experimental runs in which the optimal result was achieved. Gap shows the optimization error and is calculated as Eq. (14).

$$ {\text{gap}} = \frac{{f^{{{\text{mean}}}} - f^{{{\text{opt}}}} }}{{f^{{{\text{opt}}}} }} \times 100 $$
(14)

where \(f^{{{\text{mean}}}}\) is the average cost over all experimental runs and \(f^{{{\text{opt}}}}\) is the cost of the optimum solution to the problem.
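Eq. (14) in code, shown for completeness:

```python
def gap(f_mean, f_opt):
    """Eq. (14): percentage deviation of the mean cost from the optimum."""
    return (f_mean - f_opt) / f_opt * 100.0
```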

Table 9 Optimum results of ORLIB problems
Table 10 Optimal results of M* results

To check the robustness and efficiency of HLOCC, it is compared with state-of-the-art algorithms. For a fair comparison, the population size, number of runs and maximum iterations are set equally across all the compared methods. To facilitate comparison, the population size is set to 400, the same as in [40], and the maximum number of iterations is set to 80,000. The results for the 15 UFL problems and the M* data set were obtained from 30 independent runs and are shown in Tables 11 and 12, respectively.

Table 11 Results of the HLOCC on ORLIB
Table 12 Results of the HLOCC on M*

A comparison of HLOCC with HLO and PSO variants

To investigate and evaluate the effect of the proposed algorithm, performance comparisons with binary variants of the HLO and PSO algorithms are performed. To provide a fair comparison, the population size and the maximum iteration numbers of HLO, DHLO, TVMS-BPSO and QBPSO are set the same as those of the proposed algorithm, and the results of BPSO and IBPSO are taken from the study performed by Kiran [51]. The standard deviations and gap values of the algorithms are presented in Table 13. The win/draw/lose numbers obtained by comparing HLOCC with each other algorithm according to the gap scores are also listed in Table 13. It can be seen from Table 13 that HLOCC outperforms the binary HLO and PSO variants and has superior performance on the 15 UFL problems. The proposed method achieves the optimum value in all runs for all problems. While IBPSO cannot find the optimum value in all runs on any of the problems, BPSO and QBPSO obtain the optimum value in all runs for 2 and 6 problems, respectively. DHLO and TVMS-BPSO obtain the optimal solutions in all 30 runs on all instances except CapB and CapC. It can also be observed that, according to the gap values, HLO is the most competitive of the compared algorithms with HLOCC, since it obtains the optimum value on 14 of the 15 UFL problems.

Table 13 Comparison of HLOCC with HLO and PSO variants

A comparison of HLOCC with the DE and GA variants

Table 14 shows the gap and hit values of HLOCC and the binary DE and GA variants on the 15 UFL problems. The experimental results of the binary DE variants DisDE/rand [52] and BinDE [53] and of the GA variants named GA-SP, GA-TP, GA-UP and GA-EC are directly taken from reference [46]. For a fair comparison, the population size and the maximum number of iterations of IBDE [42] are set the same as those of the proposed algorithm. The gap values and hit numbers of the binary DE variants, the GA variants and HLOCC on the 15 UFL problems are listed in Table 14. It can be seen that HLOCC is still the best of all the algorithms because it finds the optimum value for all instances in all 30 runs. It is worth noting that although IBDE was not proposed to solve the UFLP, it performs better than the other binary DE variants.

Table 14 Comparison of HLOCC with the GA and DE variants

A comparison of HLOCC with other state-of-the-art methods

ISS [46], JayaX [54], BinEHO [55], BinSSA [56] and BinCSA [40] have recently been presented in the literature for solving the UFLP. In reference [54], the Jaya algorithm has another variant named JayaX-LSM, which uses not only binary operations but also a local search mechanism and produces better-quality solutions than JayaX. Therefore, the results of JayaX-LSM are used for comparison in this paper. The experimental results of all algorithms except HLOCC are directly taken from [40]. The standard deviations and gap values of the algorithms are presented in Table 15. The best gap values for each instance are shown in bold. All algorithms except BinEHO achieve the optimal values on the small-size, medium-size and large-size instances in all runs. Table 15 also shows the win/draw/lose numbers obtained by one-to-one comparison of HLOCC with the other methods.

Table 15 Comparison of HLOCC with state-of-the-art methods

It can be seen that HLOCC is the best method on the UFLP. It outperforms ISS, JayaX-LSM, BinEHO, BinSSA(Sim&Logic) and BinCSA, and it achieves the optimum value in all runs for 14 of the 15 UFL problems.

To evaluate the performance of all algorithms on the UFL problem, the Friedman rank test is performed on the gap results of the algorithms given in Tables 13, 14 and 15. The Friedman ranks and the final ranks of the nineteen different algorithms are presented in Table 16. HLOCC ranks first among the 19 algorithms, so it can be concluded that HLOCC is an efficient method for solving the UFL problem.
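The Friedman average ranks used in this comparison can be reproduced by ranking the algorithms on each instance (lower gap is better, so rank 1 is best, with ties averaged) and then averaging each algorithm's ranks over all instances. A sketch using SciPy's `rankdata`, assuming the gap results are arranged instance-by-algorithm (the function name is hypothetical):

```python
import numpy as np
from scipy.stats import rankdata  # averages tied ranks by default

def friedman_mean_ranks(gap_matrix):
    """gap_matrix: one row per problem instance, one column per algorithm.
    Returns each algorithm's average rank (1 = best) over all instances."""
    gaps = np.asarray(gap_matrix, dtype=float)
    # rank within each row: smaller gap -> smaller (better) rank
    per_instance_ranks = np.apply_along_axis(rankdata, 1, gaps)
    return per_instance_ranks.mean(axis=0)
```

An algorithm that achieves the smallest gap on every instance would receive an average rank of 1, matching a first-place final rank in a table such as Table 16.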

Table 16 Results of Friedman rank test for the UFL problem

Results on M*

The results of HLOCC on the M* benchmark instances are presented in this section, and HLOCC is also compared with well-known heuristic algorithms that have been used to solve the M* benchmark instances. All experiments are performed with the same parameters and conditions for a fair comparison, and the number of runs is set to 100 for all algorithms. The results of HLOCC on M* are shown in Table 12. The experimental results of the compared algorithms on the M* instances are taken directly from [40, 46, 56] and shown in Table 17, and Table 18 shows the comparison results of TS1, TS3 and HLOCC on the M* problems. HLOCC is a very efficient method on the M* instances. Although the MR* instances are the most challenging problems in this study, HLOCC still achieves the optimum value in all runs on MR2, MR4 and MR5. Over all runs, ISS and PLS obtained 8 of the 20 optimal solutions on the M* instances, LS obtained only 7 of 20, and BinSSA(Sim&Logic) and BinCSA obtained 10 and 12 of 20, respectively. HLOCC and TS3 both obtained the optimum on 16 of the 20 instances, and TS1 on 18 of the 20 M* data sets. Figure 7 shows the Friedman average rankings based on the average gap scores; according to these rankings, HLOCC is the best of the eight methods. Although TS1 obtained the most optimal values among the 20 instances, it ranks second in the Friedman ranking in Fig. 7. This is mainly because the gap values of HLOCC are smaller, which indicates the better robustness of the proposed algorithm.

Table 17 Comparison of HLOCC with ISS, PLS, LS, BinSSA(Sim&Logic) and BinCSA
Table 18 The comparison results of TS1, TS3 and HLOCC
Fig. 7 Results of Friedman rank test for the M* instances

Conclusions and future works

Humans are able to interact and share information with each other through competition and cooperation, which effectively enhances the performance of learning. Therefore, a novel human learning optimization algorithm with competitive and cooperative learning (HLOCC) is proposed, in which a simple yet powerful competitive and cooperative learning operator (CCLO) is designed and introduced to improve the performance of HLO. The role and function of the developed competitive and cooperative learning operator are then analyzed and discussed in depth based on the variation of the control parameter. The analysis shows that the CCLO helps the algorithm better explore the search space and maintain diversity at the beginning of the search, and then find the optimal values more efficiently and accelerate convergence in the subsequent iterations. Finally, the proposed HLOCC is applied to the CEC17 benchmark functions and the UFL problems to evaluate its performance. The experimental results demonstrate that HLOCC achieves superior performance thanks to its improved exploration and exploitation abilities.

The social nature of human beings allows for a large number of interactive behaviors, many of which are beneficial to human learning. The social phenomena of competition and cooperation exist not only within a group but also among different social groups. Therefore, future research will focus on the relationships of competition and cooperation among different groups and design corresponding operators for HLO to further improve its performance.