Endpoint-Flexible Coflow Scheduling across Geo-distributed Datacenters
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2020-10-01, DOI: 10.1109/tpds.2020.2992615
Wenxin Li, Xu Yuan, Keqiu Li, Heng Qi, Xiaobo Zhou, Renhai Xu

Over the last decade, we have witnessed growing data volumes generated and stored across geographically distributed datacenters. Processing such geo-distributed datasets may suffer from significant slowdown, as the underlying network flows have to traverse inter-datacenter networks with relatively low and highly heterogeneous available link bandwidth. Thus, optimizing the transmission of inter-datacenter flows, especially coflows that capture application-level semantics, is important for improving the communication performance of such geo-distributed applications. However, prior solutions for coflow scheduling have a significant limitation: they schedule coflows whose flow endpoints are already fixed, which makes them insufficient to optimize the coflow completion time (CCT). In this article, we focus on the problem of jointly considering endpoint placement and coflow scheduling to minimize the average CCT of coflows across geo-distributed datacenters. To solve this problem without any prior knowledge of coflow arrivals, we present a coflow-aware optimization framework called SmartCoflow. In SmartCoflow, we first apply an approximate algorithm to obtain the endpoint placement and scheduling decisions for a single coflow. Based on the single-coflow solution, we then develop an efficient online algorithm to handle dynamically arriving coflows. Through rigorous theoretical analysis, we prove that SmartCoflow has a non-trivial competitive ratio. We also extend SmartCoflow to incorporate various design choices or requirements of applications and operators, such as enforcing an inter-datacenter bandwidth usage budget and accounting for coflow deadlines. Through experimental results from a testbed implementation and trace-driven simulations, we demonstrate that SmartCoflow can reduce the average CCT, lower bandwidth usage, and improve the rate at which coflow deadlines are met, compared to the state-of-the-art scheduling-only method.

Updated: 2020-10-01