Energy and SLA-driven MapReduce Job Scheduling Framework for Cloud-based Cyber-Physical Systems

Authors:
Kuljeet Kaur, École de technologie supérieure, Quebec, Canada
Sahil Garg, École de technologie supérieure, Quebec, Canada
Georges Kaddoum, École de technologie supérieure, Quebec, Canada
Neeraj Kumar, Thapar Institute of Engineering & Technology, Patiala, Punjab, India (ORCID: 0000-0002-3020-3947)

ACM Transactions on Internet Technology, Volume 21, Issue 2, Article No. 31, pp. 1–24. https://doi.org/10.1145/3409772

Published: 03 May 2021

Abstract

Minimizing the energy consumption of cloud data centers (DCs) has attracted considerable research attention in recent years, particularly because emerging Cyber-Physical Systems increasingly depend on them. An effective way to improve the energy efficiency of DCs is to use efficient job scheduling strategies. However, the most challenging issue in selecting such a strategy is ensuring that the scheduled tasks honor their service-level agreement (SLA) bindings. Hence, this article presents an energy-aware and SLA-driven job scheduling framework based on MapReduce. The primary aim of the proposed framework is to explore the task-to-slot/container mapping problem as a special case of energy-aware scheduling in a deadline-constrained scenario. This problem can be viewed as a complex multi-objective problem subject to several constraints. To address it efficiently, the problem is divided into three major subproblems (SPs): deadline segregation, map-phase energy-aware scheduling, and reduce-phase energy-aware scheduling. Each SP is formulated using Integer Linear Programming. To solve the SPs effectively, heuristics based on a Greedy strategy are used along with the classical Hungarian algorithm for serial and serial-parallel systems. The proposed scheme also explores the potential of splitting the Map/Reduce phase(s) into multiple stages to achieve higher energy reductions, which is achieved by leveraging the classical Greedy approach and priority queues. The scheme has been validated using real-time data traces acquired from OpenCloud, and its performance is compared with existing schemes using several evaluation metrics: number of stages, total energy consumption, total makespan, and SLA violations. The results demonstrate the efficacy of the proposed scheme in comparison to the other schemes under different workload scenarios.
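To make the task-to-slot mapping idea concrete, the following is a minimal illustrative sketch of a greedy, deadline-constrained, energy-aware assignment that uses a priority queue. It is not the authors' algorithm: the function name `greedy_energy_schedule`, the slot model (a `(speed, power)` pair per slot), the longest-task-first ordering, and the single shared deadline are all assumptions introduced here for illustration only.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def greedy_energy_schedule(
    tasks: List[Tuple[str, float]],
    slots: Dict[str, Tuple[float, float]],
    deadline: float,
) -> Optional[Tuple[Dict[str, List[str]], float]]:
    """Greedily map tasks to slots, preferring the lowest-energy feasible slot.

    tasks:    list of (task_id, work_units)            -- hypothetical model
    slots:    slot_id -> (speed, power), both positive -- hypothetical model
    deadline: every slot must finish its queued tasks by this time

    Returns (plan, total_energy), or None if some task cannot be placed
    on any slot without missing the deadline.
    """
    busy_until = {slot_id: 0.0 for slot_id in slots}
    plan: Dict[str, List[str]] = {slot_id: [] for slot_id in slots}
    total_energy = 0.0

    # Assumption: place long tasks first, since they are hardest to fit.
    for task_id, work in sorted(tasks, key=lambda t: -t[1]):
        # Priority queue of candidate slots, keyed by energy for this task.
        candidates: List[Tuple[float, str, float]] = []
        for slot_id, (speed, power) in slots.items():
            runtime = work / speed
            heapq.heappush(candidates, (power * runtime, slot_id, runtime))

        placed = False
        while candidates:
            energy, slot_id, runtime = heapq.heappop(candidates)
            # Accept the cheapest slot that still meets the deadline.
            if busy_until[slot_id] + runtime <= deadline:
                busy_until[slot_id] += runtime
                plan[slot_id].append(task_id)
                total_energy += energy
                placed = True
                break
        if not placed:
            return None  # the deadline (SLA) would be violated
    return plan, total_energy


if __name__ == "__main__":
    # Hypothetical slots: "slow" is frugal per work unit, "fast" is costlier.
    demo_slots = {"slow": (1.0, 10.0), "fast": (2.0, 30.0)}
    demo_tasks = [("m1", 4.0), ("m2", 4.0), ("m3", 2.0)]
    print(greedy_energy_schedule(demo_tasks, demo_slots, deadline=6.0))
```

In this sketch, tightening the deadline pushes work onto the faster but more power-hungry slot, and an infeasible deadline yields `None`, which loosely mirrors the energy versus SLA trade-off described in the abstract. An exact one-to-one assignment, as solved by the Hungarian algorithm, would replace the greedy loop with an optimal matching over an energy cost matrix.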

Index Terms

1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms
      1. MapReduce algorithms
2. Mathematics of computing
  1. Mathematical analysis
    1. Mathematical optimization
      1. Mixed discrete-continuous optimization
        Integer programming

Published in
ACM Transactions on Internet Technology Volume 21, Issue 2
June 2021
599 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3453144
Editor: Ling Liu, Georgia Institute of Technology, USA
Copyright © 2021 Association for Computing Machinery.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 3 May 2021
- Accepted: 1 July 2020
- Revised: 1 June 2020
- Received: 1 April 2020
Author Tags
Cyber-physical systems
energy optimization
job scheduling
greedy approach
Hungarian algorithm
MapReduce
Qualifiers
- research-article
- Refereed