research-article

Energy and SLA-driven MapReduce Job Scheduling Framework for Cloud-based Cyber-Physical Systems

Published: 03 May 2021

Abstract

Minimizing the energy consumption of cloud data centers (DCs) has attracted much attention from the research community in recent years, particularly because emerging Cyber-Physical Systems increasingly depend on them. An effective way to improve the energy efficiency of DCs is to use efficient job scheduling strategies. However, the most challenging issue in selecting an efficient job scheduling strategy is ensuring that the scheduled tasks satisfy their service-level agreement (SLA) bindings. Hence, an energy-aware and SLA-driven job scheduling framework based on MapReduce is presented in this article. The primary aim of the proposed framework is to explore the task-to-slot/container mapping problem as a special case of energy-aware scheduling in a deadline-constrained scenario. This problem can be viewed as a complex multi-objective problem subject to different constraints. To address it efficiently, it is segregated into three major subproblems (SPs), namely, deadline segregation, map-phase energy-aware scheduling, and reduce-phase energy-aware scheduling. These SPs are individually formulated using Integer Linear Programming. To solve them effectively, heuristics based on a Greedy strategy are used along with the classical Hungarian algorithm for serial and serial-parallel systems. Moreover, the proposed scheme explores the potential of splitting the Map/Reduce phase(s) into multiple stages to achieve higher energy reductions. This is achieved by leveraging the concepts of the classical Greedy approach and priority queues. The proposed scheme has been validated using real-time data traces acquired from OpenCloud. Its performance is compared with existing schemes using four evaluation metrics, namely, number of stages, total energy consumption, total makespan, and SLA violations. The results obtained demonstrate the efficacy of the proposed scheme in comparison to the other schemes under different workload scenarios.
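To make the idea of deadline-constrained, energy-aware task-to-slot mapping more concrete, the following is a minimal illustrative sketch of a greedy heuristic that uses a priority queue. It is not the article's algorithm: the function name `greedy_energy_assignment`, the task and slot names, and the simple cost model (energy = slot power × task duration, with every task taking the same time on every slot) are all assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def greedy_energy_assignment(
    task_durations: Dict[str, float],
    slot_power: Dict[str, float],
    deadline: float,
) -> Optional[Tuple[Dict[str, str], float]]:
    """Map each task to the lowest-power slot that can still meet the deadline.

    Hypothetical model: a task's energy is slot power * task duration, and
    tasks on the same slot run one after another. Returns the mapping and the
    total energy, or None if this heuristic cannot place some task in time.
    """
    # Priority queue of (power, slot name, time at which the slot is free).
    heap: List[Tuple[float, str, float]] = [
        (power, slot, 0.0) for slot, power in slot_power.items()
    ]
    heapq.heapify(heap)

    mapping: Dict[str, str] = {}
    total_energy = 0.0

    # Place the longest tasks first, since they are the hardest to fit.
    for task, duration in sorted(task_durations.items(), key=lambda kv: -kv[1]):
        skipped = []
        chosen = None
        while heap:
            power, slot, free_at = heapq.heappop(heap)
            if free_at + duration <= deadline:
                chosen = (power, slot, free_at + duration)
                break
            skipped.append((power, slot, free_at))
        # Slots that could not host this task may still host shorter ones.
        for entry in skipped:
            heapq.heappush(heap, entry)
        if chosen is None:
            return None
        heapq.heappush(heap, chosen)
        mapping[task] = chosen[1]
        total_energy += chosen[0] * duration

    return mapping, total_energy


# Hypothetical workload: three map tasks, two slots, deadline of 6 time units.
tasks = {"m1": 4.0, "m2": 3.0, "m3": 2.0}
slots = {"s_low": 1.0, "s_high": 2.5}
result = greedy_energy_assignment(tasks, slots, deadline=6.0)
```

With these made-up inputs, the sketch places `m1` and `m3` on the low-power slot and pushes `m2` to the high-power slot because it would otherwise miss the deadline. A greedy pass like this gives no optimality guarantee; an assignment method such as the Hungarian algorithm, which the abstract also mentions, solves the one-task-per-slot case exactly.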



      • Published in

        ACM Transactions on Internet Technology, Volume 21, Issue 2
        June 2021
        599 pages
        ISSN: 1533-5399
        EISSN: 1557-6051
        DOI: 10.1145/3453144
        • Editor: Ling Liu

        Copyright © 2021 Association for Computing Machinery.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 3 May 2021
        • Accepted: 1 July 2020
        • Revised: 1 June 2020
        • Received: 1 April 2020

        Qualifiers

        • research-article
        • Refereed
