Skip to main content
Log in

Cost-driven workflow scheduling on the cloud with deadline and reliability constraints

  • Published:
Computing Aims and scope Submit manuscript

Abstract

Clouds are becoming an effective platform for scientific workflow applications. In the meantime, Cloud computing structures are moving towards being more heterogeneous. In heterogeneous service-oriented systems, managing the reliability of resources (e.g., processors and communication networks) is widely identified as a critical issue due to processor and communication failures affecting user quality of service requirements. Therefore, these types of failures should be taken into account when scheduling algorithms. The present paper proposes a scheduling approach which includes four algorithms for minimizing the workflow execution cost while also meeting the user-specified deadline and reliability. To meet the application’s requirements, the first algorithm partitions the workflow into several clusters based on a critical parent called CbCP. After that, the resource assignment algorithm, consisting of reliability and deadline distribution methods, satisfies the application’s constraints. Experimental outcomes on various workflows, generated at different scales in real and random fashion, demonstrate that the proposed heuristics meet the deadline and reliability. This ensures the minimal cost when performing a similar quality of service as opposed to the performance of the state-of-the-art DRR and QFEC+ algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

References

  1. Cai Z, Li X, Gupta JND (2016) Heuristics for provisioning services to workflows in XaaS clouds. IEEE Trans Serv Comput 9(2):250–263

    Article  Google Scholar 

  2. Zhu X, Wang J, Guo H, Zhu D, Yang LT, Liu L (2016) Fault-tolerant scheduling for real-time scientific workflows with elastic resource provisioning in virtualized clouds. IEEE Trans Parallel Distrib Syst 99:1

    Article  Google Scholar 

  3. Zhou A et al (2016) Cloud service reliability enhancement via virtual machine placement optimization. IEEE Trans Serv Comput 10(6):902–913

    Article  Google Scholar 

  4. Zhao L, Ren Y, Sakurai K (2013) Reliable workflow scheduling with less resource redundancy. Parallel Comput 39(10):567–585

    Article  MathSciNet  Google Scholar 

  5. Qiu W, Zheng Z, Wang X, Yang X, Lyu MR (2014) Reliability-based design optimization for cloud migration. IEEE Trans Serv Comput 7(2):223–236

    Article  Google Scholar 

  6. Silic M, Delac G, Srbljic S (2015) Prediction of atomic web services reliability for QoS-aware recommendation. IEEE Trans Serv Comput 8(3):425–438

    Article  Google Scholar 

  7. Bajaj R, Agrawal DP (2004) Improving scheduling of tasks in a heterogeneous environment. IEEE Trans Parallel Distrib Syst 15(2):107–118

    Article  Google Scholar 

  8. Daoud MI, Kharma N (2008) A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. J Parallel Distrib Comput 68(4):399–409

    Article  Google Scholar 

  9. Wieczorek M, Hoheisel A, Prodan R (2009) Towards a general model of the multi-criteria workflow scheduling on the grid. Future Gener Comput Syst 25:237–256

    Article  Google Scholar 

  10. Yu J, Kirley M, Buyya R (2007) Multi-objective planning for workflow execution on Grids. In: Proceedings of the 8th IEEE/ACM international conference on grid computing, pp 10–17

  11. Benoit A, Hakem M, Robert Y (2008) Fault tolerant scheduling of precedence task graphs on heterogeneous platforms. In: IPDPS Miami 2008—proceedings of the 22nd IEEE International symposium on parallel and distributed processing CD-ROM, vol 33, no. December 2007

  12. Benoit A, Hakem M, Robert Y (2009) Contention awareness and fault-tolerant scheduling for precedence constrained tasks in heterogeneous systems. Parallel Comput 35(2):83–108

    Article  Google Scholar 

  13. Girault A, Kalla H (2009) A novel bicriteria scheduling heuristics providing a guaranteed global system failure rate. IEEE Trans Dependable Secur Comput 6(4):241–254

    Article  Google Scholar 

  14. Zheng Q, Veeravalli B (2009) On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices. J Parallel Distrib Comput 69(3):282–294

    Article  Google Scholar 

  15. Zheng Q, Veeravalli B, Tham CK (2009) On the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs. IEEE Trans Comput 58(3):380–393

    Article  MathSciNet  Google Scholar 

  16. Xie G, Zeng G, Li R, Li K (2017) Quantitative fault-tolerance for reliable workflows on heterogeneous IaaS clouds. IEEE Trans Cloud Comput. https://doi.org/10.1109/TCC.2017.2780098

    Article  Google Scholar 

  17. Arabnejad H, Barbosa JG (2014) A budget constrained scheduling algorithm for workflow applications. J Grid Comput 12(4):665–679

    Article  Google Scholar 

  18. Sakellariou R, Zhao H, Tsiakkouri E, Dikaiakos MD (2007) Scheduling workflows with budget constraints. In: Integrated research in GRID computing CoreGRID integration workshop 2005 Selected Papers, pp 189–202

  19. Su S, Li J, Huang Q, Huang X, Shuang K, Wang J (2013) Cost-efficient task scheduling for executing large programs in the cloud. Parallel Comput 39(4–5):177–188

    Article  Google Scholar 

  20. Szabo C, Kroeger T (2012) Evolving multi-objective strategies for task allocation of scientific workflows on public clouds. In: 2012 IEEE congress on evolutionary computation, CEC 2012, pp 10–15

  21. Kianpisheh S, Charkari NM (2014) A grid workflow quality-of-service estimation based on resource availability prediction. J Supercomput 67(2):496–527

    Article  Google Scholar 

  22. Xie G et al (2017) Minimizing redundancy to satisfy reliability requirement for a parallel application on heterogeneous service-oriented systems. IEEE Trans Serv Comput. https://doi.org/10.1109/TSC.2017.2665552

    Article  Google Scholar 

  23. Qin X, Jiang H, Swanson DR (2002) An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems. In: parallel processing. 2002. Proceedings international conference on, pp 360–368

  24. Benoit A, Hakem M, Robert Y (2009) Optimizing the latency of streaming applications under throughput and reliability constraint. In: Proceedings international conference on parallel processing, pp 325–332

  25. Zhao L, Ren Y, Sakurai K (2011) A resource minimizing scheduling algorithm with ensuring the deadline and reliability in heterogeneous systems. In: Proceedings of the international conference on advanced information networking and applications AINA, pp 275–282

  26. Deelman E, Gannon D, Shields M, Taylor I (2009) Workflows and e-Science: an overview of workflow system features and capabilities. Future Gener Comput Syst 25(5):528–540

    Article  Google Scholar 

  27. Naghibzadeh M (2016) Modeling and scheduling hybrid workflows of tasks and task interaction graphs on the cloud. Future Gener Comput Syst 65:33–45

    Article  Google Scholar 

  28. Benoit A, Canon LC, Jeannot E, Robert Y (2012) Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms. J Sched 15(5):615–627

    Article  MathSciNet  Google Scholar 

  29. Deldari A, Naghibzadeh M, Abrishami S (2017) CCA: a deadline-constrained workflow scheduling algorithm for multicore resources on the cloud. J Supercomput 73:756–781

    Article  Google Scholar 

  30. Bittencourt LF, Madeira ERM (2010) Towards the scheduling of multiple workflows on computational grids. J Grid Comput 8:419–441

    Article  Google Scholar 

  31. Bittencourt LF, Madeira ERM (2008) A performance-oriented adaptive scheduler for dependent tasks on grids. Concurr Comput Pract Exp 20:1029–1049

    Article  Google Scholar 

  32. Topcuoglu H, Hariri S (2002) Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distrib Syst 13:260–274

    Article  Google Scholar 

  33. Girault A, Kalla H, Sighireanu M, Sorel Y (2003) An algorithm for automatically obtaining distributed and fault-tolerant static schedules. In: Proceeding of international conference on dependable systems and networks. pp 159–168

  34. Ranaweera S, Agrawal DP (2000) A task duplication based scheduling algorithm for heterogeneous systems. In: Proceedings 14th international parallel and distributed processing symposium 2000. IPDPS 2000. pp 445–450

  35. Bharathi S, Chervenak A, Deelman E, Mehta G, Su MH, Vahi K (2008) Characterization of scientific workflows. In: 2008 3rd Work. Work. Support large-scale sci. work. no. June 2014

Download references

Acknowledgements

The authors would like to express their gratitude to the anonymous reviewers for their constructive comments which have helped to improve the quality of the paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mahmoud Naghibzadeh.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Mousavi Nik, S.S., Naghibzadeh, M. & Sedaghat, Y. Cost-driven workflow scheduling on the cloud with deadline and reliability constraints. Computing 102, 477–500 (2020). https://doi.org/10.1007/s00607-019-00740-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00607-019-00740-5

Keywords

Mathematics Subject Classification

Navigation