Degrees of stochasticity in particle swarm optimization

Abstract

This paper illustrates the importance of independent, component-wise stochastic scaling values, from both a theoretical and empirical perspective. It is shown that a swarm employing scalar stochasticity in the particle update equation is unable to express every point in the search space if the problem dimensionality is sufficiently large in comparison with the swarm size. The theoretical result is emphasized by an empirical experiment which shows that a swarm using scalar stochasticity performs significantly worse when the optimum is not in the span of its initial positions. It is also demonstrated that even when the problem dimensionality allows a scalar swarm to reach the optimum, a swarm with component-wise stochasticity significantly outperforms the scalar swarm. This result is extended by considering different degrees of stochasticity, in which groups of components share the same stochastic scalar. It is demonstrated on a large range of benchmark functions that swarms with dimensional coupling (including scalar swarms in the most extreme case) perform significantly worse than a swarm with component-wise stochasticity. The paper also shows that, contrary to previous results in the field, a swarm with component-wise stochasticity is not biased towards the subspace within which it is initialized. The misconception is shown to have arisen in the previous literature due to overzealous normalization when measuring swarm movement, which is corrected in this paper.
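As a minimal illustration of the degrees of stochasticity discussed above (a sketch, not the authors' experimental code), the velocity updates can be written in Python/NumPy as follows. The inertia-weight formulation follows Shi and Eberhart (1998); the parameter values, function names, and the `groups` array are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w, c1, c2 = 0.7298, 1.4962, 1.4962  # common inertia/acceleration values


def velocity_componentwise(v, x, pbest, gbest):
    """Component-wise stochasticity: a fresh random scalar per dimension."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)


def velocity_scalar(v, x, pbest, gbest):
    """Scalar stochasticity: one random scalar shared by all dimensions.

    The new velocity is then a fixed linear combination of v, (pbest - x)
    and (gbest - x), so the particle cannot leave their span.
    """
    r1, r2 = rng.random(), rng.random()
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)


def velocity_grouped(v, x, pbest, gbest, groups):
    """Group-based stochasticity: dimensions in one group share a scalar.

    `groups` is an integer array of group ids, one per dimension.
    """
    r1 = rng.random(groups.max() + 1)[groups]  # one scalar per group
    r2 = rng.random(groups.max() + 1)[groups]
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
```

The grouped variant interpolates between the two extremes: scalar stochasticity is recovered when all dimensions share a single group, and component-wise stochasticity when every dimension is its own group.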

References

  • Bratton, D., & Kennedy, J. (2007). Defining a standard for particle swarm optimization. In Proceedings of the IEEE swarm intelligence symposium (pp. 120–127). IEEE Computer Society.

  • Chen, S., Montgomery, J., & Bolufé-Röhler, A. (2015). Measuring the curse of dimensionality and its effects on particle swarm optimization and differential evolution. Applied Intelligence, 42(3), 514–526.

  • Cleghorn, C. W., & Engelbrecht, A. P. (2018). Particle swarm stability: A theoretical extension using the non-stagnate distribution assumption. Swarm Intelligence, 12(1), 1–22.

  • Clerc, M., & Kennedy, J. (2002). The particle swarm—Explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation, 6(1), 58–73.

  • Eberhart, R., & Kennedy, J. (1995). A new optimizer using particle swarm theory. In Proceedings of the 6th international symposium on micro machine and human science (pp. 39–43).

  • Engelbrecht, A. P. (2013). Particle swarm optimization: Global best or local best? In Proceedings of the BRICS Congress on Computational Intelligence and 11th Brazilian Congress on Computational Intelligence (BRICS-CCI CBIC) (pp. 124–135).

  • Engelbrecht, A. P. (2014). Fitness function evaluations: A fair stopping condition? In Proceedings of the IEEE symposium on swarm intelligence (pp. 1–8).

  • Feng, Q., Cai, H., Li, F., Liu, X., Liu, S., & Xu, J. (2019). An improved particle swarm optimization method for locating time-varying indoor particle sources. Building and Environment, 147, 146–157.

  • Fu, H., Li, Z., Liu, Z., & Wang, Z. (2018). Research on Big Data digging of hot topics about recycled water use on micro-blog based on particle swarm optimization. Sustainability, 10(7), 1–15.

  • García-Gonzalo, E., & Fernández-Martínez, J. L. (2014). Convergence and stochastic stability analysis of particle swarm optimization variants with generic parameter distributions. Applied Mathematics and Computation, 249, 286–302.

  • Han, F., & Liu, Q. (2014). A diversity-guided hybrid particle swarm optimization based on gradient search. Neurocomputing, 137, 234–240.

  • Jamil, M., & Yang, X.-S. (2013). A literature survey of benchmark functions for global optimization problems. International Journal of Mathematical Modelling and Numerical Optimisation, 4(2), 150–194.

  • Krohling, R. A., & dos Santos Coelho, L. (2006a). Coevolutionary particle swarm optimization using Gaussian distribution for solving constrained optimization problems. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 36(6), 1407–1416.

  • Krohling, R. A., & dos Santos Coelho, L. (2006b). PSO-E: Particle swarm with exponential distribution. In 2006 IEEE International Conference on Evolutionary Computation (pp. 1428–1433).

  • Li, H., Yang, D., Su, W., Lü, J., & Yu, X. (2019). An overall distribution particle swarm optimization MPPT algorithm for photovoltaic system under partial shading. IEEE Transactions on Industrial Electronics, 66(1), 265–275.

  • Malan, K., & Engelbrecht, A. P. (2008). Algorithm comparisons and the significance of population size. In Proceedings of the IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence) (pp. 914–920).

  • Oldewage, E. T., Engelbrecht, A. P., & Cleghorn, C. W. (2017). The merits of velocity clamping particle swarm optimisation in high dimensional spaces. In Proceedings of the IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1–8).

  • Oldewage, E. T., Engelbrecht, A. P., & Cleghorn, C. W. (2018). The importance of component-wise stochasticity in particle swarm optimization. In 2018 International conference on swarm intelligence (ANTS) (pp. 264–276).

  • Olorunda, O., & Engelbrecht, A. P. (2008). Measuring exploration/exploitation in particle swarms using swarm diversity. In Proceedings of the IEEE congress on evolutionary computation (pp. 1128–1134).

  • Pandey, S., Wu, L., Guru, S.M., & Buyya, R. (2010). A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments. In 2010 24th IEEE international conference on advanced information networking and applications (pp. 400–407).

  • Paquet, U., & Engelbrecht, A. P. (2007). Particle swarms for linearly constrained optimisation. Fundamenta Informaticae, 76(1–2), 147–170.

  • Parsopoulos, K. E., & Vrahatis, M. N. (2010). Particle swarm optimization and intelligence: Advances and applications. Hershey, PA: Information Science Reference (an imprint of IGI Global).

  • Poole, D. (2011). Linear algebra: A modern introduction (3rd ed.). Canada: Cengage Learning.

  • Ramezani, F., & Lotfi, S. (2012). The modified differential evolution algorithm (MDEA). In: Pan, J.-S., Chen, S.-M., & Nguyen, N. (Eds.) Intelligent information and database systems, volume 7198 of Lecture Notes in Computer Science (pp. 109–118). Berlin: Springer.

  • Shi, Y., & Eberhart, R. (1998). A modified particle swarm optimizer. In Proceedings of the IEEE International Conference on Evolutionary Computation (pp. 69–73).

  • Sun, Y., Gao, Y., & Shi, X. (2019). Chaotic multi-objective particle swarm optimization algorithm incorporating clone immunity. Mathematics, 7(2), 146.

  • van Zyl, E. T., & Engelbrecht, A. P. (2016). Group-based stochastic scaling for PSO velocities. In Proceedings of the IEEE congress on evolutionary computation (CEC) (pp. 1862–1868).

  • Yoshida, H., Kawata, K., Fukuyama, Y., Takayama, S., & Nakanishi, Y. (2000). A particle swarm optimization for reactive power and voltage control considering voltage security assessment. IEEE Transactions on Power Systems, 15(4), 1232–1239.

  • Zahara, E., Kao, Y. T., & Su, J. R. (2009). Enhancing particle swarm optimization with gradient information. In 5th International conference on natural computation (Vol. 3, pp. 251–254).

Author information

Correspondence to A. P. Engelbrecht.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This appendix proves Proposition 2, which is restated below for completeness.

Proposition 2

Suppose \({\mathcal {I}}\) contains m linearly independent vectors. Let the search space, S, be an n-dimensional hypercube with lower bound L and upper bound U in all dimensions, i.e., \(S = [L, U]^n\). If \(m < n\), then \(\textit{span}({\mathcal {I}}) \cap S \subsetneq S\). Thus, \({\mathcal {I}}\) can only be a spanning set of S if it contains at least n linearly independent elements.

It is a fundamental result of linear algebra that any two bases for a subspace of \({\mathbb {R}}^n\) must contain the same number of vectors (Poole 2011). Observe that if the search space, S, were all of \({\mathbb {R}}^n\), then the standard unit vectors

$$\begin{aligned} {\mathbf {e}}_1, {\mathbf {e}}_2, \ldots , {\mathbf {e}}_n = \langle 1, 0, 0, \ldots , 0\rangle ^T, \langle 0, 1, 0, \ldots , 0\rangle ^T, \ldots , \langle 0, \ldots , 0, 1\rangle ^T \end{aligned}$$
(19)

would form a basis for S. Thus, for \({\mathcal {I}}\) to span S, it must contain at least n linearly independent vectors. However, the search space \(S = [L, U]^n\) does not constitute a subspace of \({\mathbb {R}}^n\), since it is not closed under addition and scalar multiplication. Proposition 2 establishes the required result for this special case. The proof requires the fundamental theorem of invertible matrices (given below from Poole (2011)) and Lemma 1.
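As an aside (not part of the original argument), the counting fact can be checked numerically. In the sketch below, the number of vectors, the dimension, and the chosen target point are arbitrary examples; with \(m < n\) vectors, a generic point of the search space leaves a nonzero least-squares residual, i.e. it lies outside the span.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 10, 3                     # dimensionality vs. number of vectors
Z = rng.standard_normal((n, m))  # m random columns; independent with prob. 1
target = np.full(n, 0.5)         # a point of S = [L, U]^n (here L <= 0.5 <= U)

coef, residual, rank, _ = np.linalg.lstsq(Z, target, rcond=None)
print(rank)      # 3: span(Z) is only a 3-dimensional subspace of R^10
print(residual)  # almost surely positive: target is not in span(Z)
```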

Theorem 1

(Fundamental Theorem of Invertible Matrices (Poole 2011))

Let A be an \(n \times n\) matrix. Then, the following statements are equivalent:

  1. A is invertible.

  2. \(A{\mathbf {x}} = {\mathbf {0}}\) has only the trivial solution.

  3. The column vectors of A are linearly independent.

  4. The column vectors of A span \({\mathbb {R}}^n\).

Lemma 1

Let \(S = [L, U]^n\). There exists a set \({\tilde{E}}\) that forms a “basis” for S, in the sense that \({\tilde{E}}\)’s elements are linearly independent, any \({\mathbf {x}} \in S\) can be expressed as a linear combination of elements of \({\tilde{E}}\), and \({\tilde{E}} \subset S\).

Proof

Let the centre of the search space be denoted by

$$\begin{aligned} {\mathbf {M}} = \langle m_1, m_2, \ldots , m_n\rangle ^T = \left\langle \frac{L + U}{2}, \frac{L + U}{2}, \ldots , \frac{L + U}{2}\right\rangle ^T \end{aligned}$$
(20)

Since the search space is the same in every dimension, \({\mathbf {M}} = \langle c, c, \ldots , c\rangle ^T\), where \(c = \frac{L + U}{2}\). The proof below assumes \(c \ne 0\), since the row reduction scales each row by \(\frac{1}{c}\).

Let \({\tilde{E}} = \{ \tilde{{\mathbf {e}}}_1, \tilde{{\mathbf {e}}}_2, \ldots , \tilde{{\mathbf {e}}}_n \}\) where

$$\begin{aligned} \tilde{{\mathbf {e}}}_1&= \langle c + 0.5c, c, \ldots , c\rangle ^T \nonumber \\ \tilde{{\mathbf {e}}}_2&= \langle c, c + 0.5c, \ldots , c\rangle ^T \nonumber \\&\;\;\vdots \nonumber \\ \tilde{{\mathbf {e}}}_n&= \langle c, c, \ldots , c + 0.5c\rangle ^T \end{aligned}$$
(21)

so that for any \(\tilde{{\mathbf {e}}}_i\), all coordinates except the ith are equal to c, while the ith coordinate equals 1.5c. Clearly, \({\tilde{E}} \subset S\). Let A be the \(n \times n\) matrix with column vectors \(\tilde{{\mathbf {e}}}_1, \tilde{{\mathbf {e}}}_2, \ldots , \tilde{{\mathbf {e}}}_n\).
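Before the formal argument, the construction can be checked numerically (a sketch only, not part of the proof; the values of n and c are arbitrary examples, with c nonzero):

```python
import numpy as np

n, c = 6, 2.5                                  # example dimension, nonzero c
A = c * np.ones((n, n)) + 0.5 * c * np.eye(n)  # columns are e~_1, ..., e~_n
print(np.linalg.matrix_rank(A))                # n: columns linearly independent
print(np.allclose(np.linalg.solve(A, np.zeros(n)), 0))  # only x = 0 solves Ax = 0
```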

We prove that \(A {\mathbf {x}} = {\mathbf {0}}\) has only the trivial solution. Then, by Theorem 1, it will follow that:

  1. A’s column vectors are linearly independent; in other words, \({\tilde{E}}\) is linearly independent.

  2. The column vectors of A span \({\mathbb {R}}^n\), which means that \({\tilde{E}}\) spans \({\mathbb {R}}^n\). Since \(S \subset {\mathbb {R}}^n\), any element of S can thus be expressed as a linear combination of elements from \({\tilde{E}}\).

It will now be proved that \(A {\mathbf {x}} = {\mathbf {0}}\) has only the trivial solution, thereby proving the required properties of \({\tilde{E}}\). The equation \(A {\mathbf {x}} = {\mathbf {0}}\) can be written as the following system of linear equations:

$$\begin{aligned} (c + 0.5c) x_1 + c x_2 + \cdots + c x_n&= 0 \nonumber \\ c x_1 + (c + 0.5c) x_2 + \cdots + c x_n&= 0 \nonumber \\&\;\;\vdots \nonumber \\ c x_1 + c x_2 + \cdots + (c + 0.5c) x_n&= 0, \end{aligned}$$
(22)

where the jth equation is formed from the jth component of the system \(A {\mathbf {x}} = {\mathbf {0}}\). In turn, the system of linear equations can be represented in augmented matrix form as follows:

$$\begin{aligned} \left( \begin{array}{ccccc|c} 1.5c &{} c &{} c &{} \ldots &{} c &{} 0\\ c &{} 1.5c &{} c &{} \ldots &{} c &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ c &{} c &{} c &{} \ldots &{} 1.5c &{} 0 \end{array}\right) \end{aligned}$$
(23)

where the ith column contains the coefficients of \(x_i\). The final column contains the right-hand side of all the equations in the linear system (in this case, the zero vector). The augmented matrix is manipulated by means of matrix row operations, which take one of the following forms:

  1. Switch the ith and jth rows (denoted by \(R_i \leftrightarrow R_j\), where \(R_i\) denotes the ith row).

  2. Multiply the ith row by a nonzero constant k (denoted by \(R_i \rightarrow k R_i\)).

  3. Add row j to row i (denoted by \(R_i \rightarrow R_i + R_j\)).

From these basic operations, more complex operations can be constructed, such as subtracting one row from another (\(R_i \rightarrow R_i - R_j\)). For the sake of brevity, the notation \(\forall _i R_i \rightarrow k R_i\) denotes that each row i is multiplied by the constant k; the subscript \(i \ne n\) restricts an operation to all rows other than the nth. Similarly, \(\forall _{i \ne n} R_i \rightarrow R_i - R_n\) indicates that the nth row is subtracted from each of the other rows in turn. The system \(A {\mathbf {x}} = {\mathbf {0}}\) will now be solved as follows.

$$\begin{aligned}&\left( \begin{array}{ccccc|c} 1.5c &{} c &{} c &{} \ldots &{} c &{} 0\\ c &{} 1.5c &{} c &{} \ldots &{} c &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ c &{} c &{} c &{} \ldots &{} 1.5c &{} 0 \end{array}\right) \xrightarrow {\forall _i R_i \rightarrow \frac{1}{c} R_i} \left( \begin{array}{ccccc|c} 1.5 &{} 1 &{} 1 &{} \ldots &{} 1 &{} 0\\ 1 &{} 1.5 &{} 1 &{} \ldots &{} 1 &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ 1 &{} 1 &{} 1 &{} \ldots &{} 1.5 &{} 0 \end{array}\right) \\&\quad \xrightarrow {\forall _{i \ne n} R_i \rightarrow R_i - R_n} \left( \begin{array}{ccccc|c} 0.5 &{} 0 &{} 0 &{} \ldots &{} -0.5 &{} 0\\ 0 &{} 0.5 &{} 0 &{} \ldots &{} -0.5 &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ 1 &{} 1 &{} 1 &{} \ldots &{} 1.5 &{} 0 \end{array}\right) \xrightarrow {\forall _{i \ne n} R_i \rightarrow 2 R_i} \left( \begin{array}{ccccc|c} 1 &{} 0 &{} 0 &{} \ldots &{} -1 &{} 0\\ 0 &{} 1 &{} 0 &{} \ldots &{} -1 &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ 1 &{} 1 &{} 1 &{} \ldots &{} 1.5 &{} 0 \end{array}\right) \\&\quad \xrightarrow {R_n \rightarrow R_n - R_1 - \cdots - R_{n-1}} \left( \begin{array}{ccccc|c} 1 &{} 0 &{} 0 &{} \ldots &{} -1 &{} 0\\ 0 &{} 1 &{} 0 &{} \ldots &{} -1 &{} 0\\ \vdots &{} &{} &{} \ddots &{} \vdots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \ldots &{} 1.5 - (-1)(n-1) &{} 0 \end{array}\right) \end{aligned}$$

Therefore, \((1.5 + (n-1))\, x_n = (n + 0.5)\, x_n = 0\). Since \(n \ge 1\), \(n + 0.5 \ne 0\), so the equation has only the trivial solution, \(x_n = 0\). But then, for every \(i = 1, \ldots , n-1\),

$$\begin{aligned} x_i - x_n&= 0 \nonumber \\ \implies x_i - 0&= 0 \nonumber \\ \implies x_i&= 0 \end{aligned}$$
(24)

Therefore, \(A {\mathbf {x}} = {\mathbf {0}}\) has only the trivial solution. The existence of the required set \({\tilde{E}}\) is thus proved. \(\square \)
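The reduction above can also be reproduced mechanically. The sketch below (using SymPy, which the paper itself does not use; the values of n and c are arbitrary examples) row-reduces the augmented matrix and confirms that every variable is a pivot variable, so only the trivial solution exists:

```python
import sympy as sp

n, c = 4, sp.Rational(5, 2)      # example dimension and an arbitrary nonzero c
A = sp.Matrix(n, n, lambda i, j: sp.Rational(3, 2) * c if i == j else c)
aug = A.row_join(sp.zeros(n, 1))  # the augmented system (A | 0)

rref_matrix, pivot_columns = aug.rref()
print(rref_matrix)    # identity block with an all-zero final column
print(pivot_columns)  # (0, 1, 2, 3): a pivot in every variable, so x = 0 only
```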

Using Lemma 1, Proposition 2 can be proved as follows.

Proof

(Proposition 2) By Lemma 1, there exists a set \({\tilde{E}}\) that forms a “basis” for S. The set \({\tilde{E}}\) is used to prove that if \({\mathcal {I}}\) contains m linearly independent vectors and \(m < n\), then \({\mathcal {I}}\) spans only a strict subset of S.

Let \(m < n\). Towards a contradiction, suppose that \({\mathcal {I}}\) and \({\tilde{E}}\) both form “bases” for S. In other words, \(\textit{span}({\mathcal {I}}) \cap S\) = \(\textit{span}({\tilde{E}}) \cap S = S\).

Consider the equation,

$$\begin{aligned} c_1 \tilde{{\mathbf {e}}}_1 + c_2 \tilde{{\mathbf {e}}}_2 + \cdots + c_n \tilde{{\mathbf {e}}}_n = {\mathbf {0}}, \end{aligned}$$
(25)

where \(\tilde{{\mathbf {e}}}_1, \tilde{{\mathbf {e}}}_2, \ldots , \tilde{{\mathbf {e}}}_n \in {\tilde{E}}\) and \(c_1, c_2, \ldots , c_n \in {\mathbb {R}}\). Since each \(\tilde{{\mathbf {e}}}_i \in S\) and any element in S can be expressed as a linear combination of vectors in \({\mathcal {I}}\), each \(\tilde{{\mathbf {e}}}_i\) can be written as

$$\begin{aligned} \begin{array}{c} \tilde{{\mathbf {e}}}_1 = a_{11} {\mathbf {z}}_1 + a_{12} {\mathbf {z}}_2 + \cdots + a_{1m} {\mathbf {z}}_m \\ \vdots \\ \tilde{{\mathbf {e}}}_i = a_{i1} {\mathbf {z}}_1 + a_{i2} {\mathbf {z}}_2 + \cdots + a_{im} {\mathbf {z}}_m \\ \vdots \\ \tilde{{\mathbf {e}}}_n = a_{n1} {\mathbf {z}}_1 + a_{n2} {\mathbf {z}}_2 + \cdots + a_{nm} {\mathbf {z}}_m \\ \end{array} \end{aligned}$$

Thus, Eq. (25) can be rewritten as follows:

$$\begin{aligned}&c_1 (a_{11} {\mathbf {z}}_1 + a_{12} {\mathbf {z}}_2 + \cdots + a_{1m} {\mathbf {z}}_m) + \cdots \nonumber \\&\qquad +\, c_i (a_{i1} {\mathbf {z}}_1 + a_{i2} {\mathbf {z}}_2 + \cdots + a_{im} {\mathbf {z}}_m) + \cdots \nonumber \\&\qquad + c_n (a_{n1} {\mathbf {z}}_1 + a_{n2} {\mathbf {z}}_2 + \cdots + a_{nm} {\mathbf {z}}_m ) \end{aligned}$$
(26)
$$\begin{aligned}&\quad = (c_1 a_{11} + c_2 a_{21} + \cdots + c_n a_{n1}) {\mathbf {z}}_1 + \cdots \nonumber \\&\qquad +\, (c_1 a_{1j} + c_2 a_{2j} + \cdots + c_n a_{nj}) {\mathbf {z}}_j + \cdots \nonumber \\&\qquad +\, (c_1 a_{1m} + c_2 a_{2m} + \cdots + c_n a_{nm}) {\mathbf {z}}_m \nonumber \\&\quad = \sum _{i=1}^n c_i a_{i1} {\mathbf {z}}_1 + \sum _{i=1}^n c_i a_{i2} {\mathbf {z}}_2 +\cdots + \sum _{i=1}^n c_i a_{im} {\mathbf {z}}_m \end{aligned}$$
(27)

Now, \({\mathbf {z}}_1, {\mathbf {z}}_2, \ldots , {\mathbf {z}}_m\) are linearly independent. Therefore, since expression (27) must equal \({\mathbf {0}}\), each coefficient \(\sum _{i=1}^n c_i a_{ij}\) must equal zero for all \(j = 1, \ldots , m\). This can be written as a homogeneous system of m equations in the n variables \(c_1, \ldots , c_n\). Since \(m < n\), there are more variables than equations, so the system has infinitely many solutions. In particular, it has a non-trivial solution. But this gives a non-trivial dependence relation in Eq. (25), so \({\tilde{E}}\) must be a linearly dependent set of vectors. This is a contradiction, since \({\tilde{E}}\) is linearly independent. Therefore, \({\mathcal {I}}\) spans a strict subset of S. \(\square \)
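The dimension-counting step of this proof can be illustrated numerically (a sketch only; since in general the \(\tilde{{\mathbf {e}}}_i\) are not exactly in the span of the \({\mathbf {z}}_j\), least-squares coefficients stand in for the exact expansion, and all sizes are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, c = 5, 3, 2.0
Z = rng.standard_normal((n, m))                # m < n independent vectors z_j
E = c * np.ones((n, n)) + 0.5 * c * np.eye(n)  # columns are e~_1, ..., e~_n

# Least-squares coefficients a_ij in e~_i ~ sum_j a_ij z_j (row i <-> e~_i).
A = np.linalg.lstsq(Z, E, rcond=None)[0].T     # shape (n, m)

# The homogeneous system sum_i c_i a_ij = 0 (j = 1..m) has m equations in
# n unknowns, so its solution space has dimension at least n - m > 0.
s = np.linalg.svd(A.T, compute_uv=False)
null_dim = A.T.shape[1] - int(np.sum(s > 1e-10))
print(null_dim)  # >= n - m = 2: non-trivial dependence relations exist
```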

Cite this article

Oldewage, E.T., Engelbrecht, A.P. & Cleghorn, C.W. Degrees of stochasticity in particle swarm optimization. Swarm Intell 13, 193–215 (2019). https://doi.org/10.1007/s11721-019-00168-9
