Abstract
Population-based meta-heuristic algorithms are among the dominant methods for solving challenging real-world problems in diverse fields. The Whale Optimization Algorithm (WOA) is a recent swarm intelligence meta-heuristic based on the bubble-net feeding behavior of humpback whales. Despite its capability to solve complex optimization problems, WOA requires an enormous amount of computation when solving large-scale problems. This work proposes Spark-WOA, a distributed implementation of WOA on the Apache Spark platform, to enhance its performance and reduce computational complexity. The proposed algorithm exploits the in-memory computation and broadcast features of Apache Spark to provide better performance and scalability. Details of the proposed algorithm are presented, and its performance is compared with that of a recent Apache Hadoop implementation. Experimental results demonstrate the superiority of the proposed implementation in terms of both speed and scalability.
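To make the computation that the abstract refers to concrete, the following is a minimal single-machine sketch of the standard WOA update rules as described by Mirjalili and Lewis (2016). It is not the Spark-WOA implementation. The names `sphere`, `update_whale`, and `woa`, the search bounds, the seed, and the synchronous per-iteration update are assumptions made for illustration. The per-whale update is written as a pure function so that the comments can indicate where a distributed version might plausibly map over the population and broadcast the current best solution.

```python
import math
import random


def sphere(x):
    """Illustrative benchmark objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def update_whale(x, best, population, a, rng, lo, hi, b=1.0):
    """Return the new position of one whale.

    Written as a pure function of (x, best, population, a) so it resembles a
    map task; in a distributed setting `best` is the kind of small read-only
    value that could be broadcast (an assumption, not the paper's design).
    """
    r1, r2 = rng.random(), rng.random()
    A = 2.0 * a * r1 - a          # step coefficient in [-a, a]
    C = 2.0 * r2                  # weighting coefficient in [0, 2]
    p = rng.random()              # chooses between the two mechanisms
    l = rng.uniform(-1.0, 1.0)    # spiral parameter
    if p < 0.5:
        # Encircle the best whale when |A| < 1, otherwise explore
        # relative to a randomly chosen whale.
        ref = best if abs(A) < 1.0 else rng.choice(population)
        new = [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
    else:
        # Spiral (bubble-net) move around the best whale.
        spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
        new = [abs(bi - xi) * spiral + bi for bi, xi in zip(best, x)]
    # Clamp to the search bounds.
    return [min(max(v, lo), hi) for v in new]


def woa(objective, dim=5, n_whales=20, n_iter=200, lo=-10.0, hi=10.0, seed=0):
    """Run a basic WOA and return (best position, best fitness, initial best fitness)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for _ in range(dim)]
                  for _ in range(n_whales)]
    best = min(population, key=objective)
    best_fit = objective(best)
    initial_fit = best_fit
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter  # decreases linearly from 2 toward 0
        # Synchronous update: every whale reads the same previous population
        # and the same best solution (the "map" step of this sketch).
        population = [update_whale(x, best, population, a, rng, lo, hi)
                      for x in population]
        # Select the iteration's best candidate (the "reduce" step).
        cand = min(population, key=objective)
        cand_fit = objective(cand)
        if cand_fit < best_fit:
            best, best_fit = list(cand), cand_fit
    return best, best_fit, initial_fit


if __name__ == "__main__":
    position, fitness, start = woa(sphere)
    print("initial best fitness:", start)
    print("final best fitness:", fitness)
```

The fitness evaluations and position updates are independent across whales within an iteration, which is why this kind of algorithm is a natural candidate for data-parallel execution. Only the best solution has to be shared between iterations.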
About this article
Cite this article
AlJame, M., Ahmad, I. & Alfailakawi, M. Apache Spark Implementation of Whale Optimization Algorithm. Cluster Comput 23, 2021–2034 (2020). https://doi.org/10.1007/s10586-020-03162-7
DOI: https://doi.org/10.1007/s10586-020-03162-7