Modeling and analysis of distributed schedulers in data center cluster networks
Cluster Computing (IF 4.4), Pub Date: 2021-06-17, DOI: 10.1007/s10586-021-03343-y
Reem Alshahrani, Hassan Peyravi

One of the goals of cloud service providers is to satisfy service-level agreements without significant over-provisioning in data center clusters. Efforts to meet these requirements have relied mainly on resource over-provisioning rather than on identifying performance bottlenecks. While increasing parallelism tends to reduce the average and tail latency, the joint impact of concurrent job scheduling and parallel task processing is challenging to model analytically, particularly when compared with models developed without the notion of concurrency. This article presents an analytical model for distributed schedulers in data center cluster networks. The model can be used to investigate how latency affects data center network design and how many resources should be allocated to meet service-level agreements. To gain better insight, we build upon ideas from queuing networks, which provide a framework for measuring expected latency against resource provisioning. The model is based on tandem queuing networks and fork–join systems, and it computes expected latency in closed form at various stages of data center cluster networks. Theoretical analysis and simulations have been conducted to demonstrate the effectiveness of the proposed model and to strike a balance between expected latency and resource utilization. Results obtained from various simulation scenarios on different data center traffic traces confirm the soundness of the model.
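
To make the two building blocks named in the abstract concrete, the sketch below evaluates textbook closed forms for a tandem of queues and for a fork–join step. It is not the authors' model: it assumes Poisson arrivals, exponential service times (M/M/1 stages), and a fork–join step with no queueing at the parallel servers. The function names and the rates in the usage lines are hypothetical and chosen only for illustration.

```python
def mm1_sojourn(lam: float, mu: float) -> float:
    """Expected time in an M/M/1 queue (waiting + service): 1 / (mu - lam)."""
    if lam >= mu:
        raise ValueError("unstable stage: arrival rate must be below service rate")
    return 1.0 / (mu - lam)


def tandem_latency(lam: float, mus: list[float]) -> float:
    """Expected end-to-end latency of M/M/1 stages in series.

    With Poisson arrivals and exponential service, each stage sees Poisson
    input at rate lam (Burke's theorem), so the stage latencies simply add.
    """
    return sum(mm1_sojourn(lam, mu) for mu in mus)


def fork_join_service(k: int, mu: float) -> float:
    """Expected completion time of k parallel tasks, each exponential(mu).

    The job finishes when the slowest task finishes; the expected maximum
    of k i.i.d. exponentials is H_k / mu, where H_k is the k-th harmonic
    number. Queueing at the parallel servers is ignored (an assumption).
    """
    harmonic = sum(1.0 / i for i in range(1, k + 1))
    return harmonic / mu


# Hypothetical rates: jobs arrive at 2 per second and pass through two stages.
two_stage = tandem_latency(2.0, [4.0, 6.0])   # 1/(4-2) + 1/(6-2) = 0.75
two_tasks = fork_join_service(2, 1.0)         # (1 + 1/2) / 1 = 1.5
```

Under these assumptions, adding parallel tasks raises the fork–join term only logarithmically in k, while raising a stage's service rate shrinks its tandem term, which is the kind of latency versus provisioning trade-off the abstract describes.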




Updated: 2021-06-18