Data centers’ services restoration based on the decision-making of distributed agents
Telecommunication Systems (IF 1.7), Pub Date: 2020-03-14, DOI: 10.1007/s11235-020-00660-2
Príscila Alves Lima, Antônio Sá Barreto Neto, Paulo Maciel

The growing number of companies migrating their IT infrastructure to cloud environments has motivated many studies on distributed backup strategies that improve the availability of these companies’ systems. In this scenario, it is essential to study mechanisms that evaluate network conditions in order to minimize transmission time and thereby improve system availability. The goal of this study is to build models that evaluate the availability of services running on cloud data center infrastructure, emphasizing the impact of throughput variation on data redundancy and, consequently, on service availability. Based on this, the research proposes smart models that can be deployed in each data center of a distributed arrangement of data centers to help the system administrator choose the best data center for restoring the services of a faulty one. To analyze the impact of network throughput on service availability, we gathered the mean time to failure (MTTF) and mean time to repair (MTTR) metrics of the data center’s components and services, generated a reliability block diagram to obtain the MTTF of the system as a whole, and developed a formalism to model the network component. Based on the results, we built a stochastic Petri net (SPN) model to represent the system and obtain its availability under many network conditions. We then analyze the system’s availability to discuss the impact of network conditions on it. With the models built and the availability obtained under many network conditions, the enormous impact of network conditions on system availability becomes visible in a plot of the annual downtime. Using the models developed to study system availability, we developed smart agents capable of predicting the transfer time of a bulk of data and, with that prediction, choosing the data center with the best network conditions to restore the services of a faulty one.
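
The quantities named in the abstract can be illustrated with textbook formulas. The sketch below is not the authors' SPN model or their agents. It assumes steady-state availability A = MTTF / (MTTF + MTTR), a series reliability block diagram, and a transfer time equal to data volume divided by a constant throughput. All function names, site names, and numbers are hypothetical.

```python
HOURS_PER_YEAR = 8760.0  # non-leap year


def steady_state_availability(mttf_hours, mttr_hours):
    """Textbook steady-state availability: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series_availability(availabilities):
    """Availability of components in series (all must be up)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def transfer_time_hours(data_gb, throughput_mbps):
    """Time to move data_gb gigabytes at a constant throughput in Mbit/s.

    Assumption: decimal units, so 1 GB = 8000 megabits.
    """
    return data_gb * 8000.0 / throughput_mbps / 3600.0


def pick_restore_site(data_gb, throughput_by_site):
    """Choose the site with the shortest predicted transfer time.

    Assumption: throughput_by_site maps a hypothetical site name to its
    currently measured throughput in Mbit/s.
    """
    return min(
        throughput_by_site,
        key=lambda site: transfer_time_hours(data_gb, throughput_by_site[site]),
    )


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the calculation.
    sites = {"dc_a": 100.0, "dc_b": 1000.0}
    best = pick_restore_site(450.0, sites)
    # Simplification: treat restoration time as repair time plus transfer time.
    mttr = 1.0 + transfer_time_hours(450.0, sites[best])
    a = steady_state_availability(998.0, mttr)
    print(best, a, annual_downtime_hours(a))
```

Treating the transfer time as part of the restoration time is a simplification. It shows why lower throughput lengthens recovery and lowers availability, which is the effect the abstract attributes to network conditions.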




Updated: 2020-03-14