Data placement in distributed data centers for improved SLA and network cost
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2020-08-26, DOI: 10.1016/j.jpdc.2020.07.006
Yuqi Fan, Chen Wang, Bei Zhang, Shuyang Gu, Weili Wu, Dingzhu Du

Large-scale data-intensive applications serve users by routing service requests to geographically distributed data centers interconnected by Internet links. To achieve good reliability and low data access latency, cloud service providers often place multiple copies of the data in different data centers at the same time. The network communication required to update these copies incurs an operational cost. The penalty for violating the Service Level Agreement (SLA) on data access from the data centers imposes a further operational cost on the service providers. In this paper, we tackle the problem of data placement in distributed data centers, aiming to minimize the operational cost incurred by the delay SLA violation penalty and by inter-data center network communication, assuming each data item has K replicas. We propose a K-level Cluster-based Data Placement algorithm (K-CDP) for the problem. The algorithm solves the linear programming relaxation, and the corresponding dual program, of the problem of minimizing the SLA violation penalty cost caused by placing a replica of each data item in a data center. Based on the obtained solutions, the algorithm clusters the data so that data items with similar sets of placeable data centers form a data cluster. For the data in each cluster, the algorithm selects K data centers to minimize the operational cost. We prove that K-CDP is a 2-approximation algorithm for the data placement problem. Our simulation results demonstrate that the proposed algorithm can effectively reduce the penalty cost incurred by delay SLA violation, the network communication cost, and the operational cost of data centers.
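
The objective described in the abstract can be illustrated with a small sketch. This is not the K-CDP algorithm: it is a brute-force search over all K-subsets of data centers for a single data item, under a cost model that is an assumption made here for illustration (a fixed penalty per user whose nearest replica exceeds the latency bound, plus a pairwise update cost between replicas). The names `placement_cost`, `best_placement`, `latency`, `sla_ms`, `penalty`, and `link_cost` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def placement_cost(replicas, latency, sla_ms, penalty, link_cost):
    """Operational cost of one candidate replica set (assumed cost model)."""
    # SLA penalty: a user is penalised when even the nearest replica
    # is slower than the agreed latency bound.
    sla = sum(
        penalty
        for user_latencies in latency
        if min(user_latencies[d] for d in replicas) > sla_ms
    )
    # Network cost: assume every pair of replicas exchanges update traffic.
    net = sum(link_cost[a][b] for a, b in combinations(replicas, 2))
    return sla + net


def best_placement(k, latency, sla_ms, penalty, link_cost):
    """Exhaustively pick the k data centers with the lowest operational cost."""
    n = len(link_cost)  # number of data centers
    return min(
        combinations(range(n), k),
        key=lambda r: placement_cost(r, latency, sla_ms, penalty, link_cost),
    )


# Toy instance: 2 users, 3 data centers, K = 2 replicas.
latency = [[10, 80, 90], [95, 20, 85]]        # latency[user][dc] in ms
link_cost = [[0, 4, 1], [4, 0, 1], [1, 1, 0]]  # symmetric update cost
print(best_placement(2, latency, 50, 5.0, link_cost))  # (0, 1)
```

In this toy instance the pair (0, 1) has the most expensive link but is the only choice that keeps both users within the latency bound, so it has the lowest total cost. Exhaustive search is exponential in K, which is why an approximation algorithm such as the one the abstract describes is needed at scale.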




Updated: 2020-08-26