Scheduling large-scale scientific workflow on virtual machines with different numbers of vCPUs
The Journal of Supercomputing (IF 3.3), Pub Date: 2020-04-23, DOI: 10.1007/s11227-020-03273-3
Hao Wu, Xin Chen, Xiaoyu Song, Chi Zhang, He Guo

With the wide deployment of cloud computing in scientific computing, cost minimization is increasingly critical for large-scale scientific workflows. Unfortunately, because workflows based on directed acyclic graphs (DAGs) are highly intricate and virtual machines (VMs) can be used flexibly on cloud platforms, existing workflow scheduling approaches fail to balance the parallelism and the topology of a DAG-based workflow when allocating VMs, which leads to low VM utilization and higher cost. To address these issues, this paper presents a novel task scheduling framework, the cost minimization approach with the DAG splitting method (COMSE), for minimizing the cost of running a deadline-constrained large-scale scientific workflow. First, we provide a comprehensive theoretical analysis of how to improve the utilization of a resource-balanced multi-vCPU VM that runs multiple tasks simultaneously. Second, to balance the parallelism and the topology of a workflow, we simplify the DAG-based workflow and, based on the simplified DAG, devise a DAG splitting method to preprocess the workflow. Third, since cloud instances are billed by the hour, we design an exact algorithm, named instance hours minimization by Dijkstra (TOID), that finds the optimal operation pattern for a given schedule so that the consumed instance hours are minimized. Finally, by employing the DAG splitting method and TOID, COMSE schedules a deadline-constrained large-scale scientific workflow on multi-vCPU VMs with two objectives: minimizing the computation cost and minimizing the communication cost. The approach is evaluated in a rigorous performance study using real-world workflows, and the results show that COMSE outperforms existing algorithms in terms of computation cost and communication cost.
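
The abstract states only that TOID is an exact, Dijkstra-based algorithm for minimizing billed instance hours; it does not give the graph construction. The sketch below is therefore not the authors' TOID. It illustrates the general idea under stated assumptions: a single VM has a fixed, sorted list of non-overlapping busy intervals (integer seconds), each lease is billed in whole hours rounded up, and the only decision is whether to keep the VM running across an idle gap or to shut it down and start a new lease. The names `min_instance_hours`, `intervals`, and `billing_unit` are hypothetical.

```python
import heapq
import math


def min_instance_hours(intervals, billing_unit=3600):
    """Illustrative only: choose leases covering all busy intervals at minimum billed units.

    intervals: sorted, non-overlapping (start, end) pairs in integer seconds.
    Node i means "intervals[:i] are covered". An edge i -> j is a single lease
    running from intervals[i] start to intervals[j - 1] end, billed by rounding up.
    """
    n = len(intervals)
    if n == 0:
        return 0, []

    dist = [math.inf] * (n + 1)
    prev = [None] * (n + 1)
    dist[0] = 0
    heap = [(0, 0)]  # (billed units so far, node)

    # Standard Dijkstra over nodes 0..n with non-negative edge costs.
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry
        if i == n:
            break
        lease_start = intervals[i][0]
        for j in range(i + 1, n + 1):
            lease_end = intervals[j - 1][1]
            # Integer ceiling of the lease duration in billing units.
            cost = (lease_end - lease_start + billing_unit - 1) // billing_unit
            if d + cost < dist[j]:
                dist[j] = d + cost
                prev[j] = i
                heapq.heappush(heap, (dist[j], j))

    # Walk the predecessor chain back from node n to recover the chosen leases.
    leases = []
    j = n
    while j > 0:
        i = prev[j]
        leases.append((intervals[i][0], intervals[j - 1][1]))
        j = i
    leases.reverse()
    return dist[n], leases
```

For the busy intervals `[(0, 1800), (2400, 3600), (10800, 12600)]`, this sketch returns 2 billed hours with leases `(0, 3600)` and `(10800, 12600)`: bridging the short first gap costs nothing extra, while keeping the VM on across the two-hour gap would raise the bill to 4 hours.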

Updated: 2020-04-23