Data-driven Derivation of an Analytic Model for Parallel Servers with Job Replication
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2020-10-01, DOI: 10.1109/tpds.2020.2992571
Noor Bajunaid, Daniel A. Menasce

The job replication problem has been studied recently as a mechanism to improve performance and availability of systems with $n$ parallel servers, each with its own queue. A dispatcher using some policy sends $d$ $(1 \leq d \leq n)$ copies of a job to $d$ of the servers. Copies are eliminated from the system as soon as the first copy completes from any of the $d$ servers. This article introduces a data-driven method to derive closed-form expressions for the average response time and other metrics of jobs as a function of the degree of replication $d$. This method consists of developing a simulator for the system in order to generate a very large number of datasets for a wide range of input parameters. A statistical and visualization analysis of the data provides the analytical models. It is important to emphasize the difference between using simulation methods to obtain the value of metrics (e.g., average response time) of a computer system given values of input parameters and using our data-driven method to obtain closed-form expressions that relate output metrics to input parameters. The latter is the focus of our approach. The analysis presented here covers results for homogeneous and heterogeneous servers with exponentially distributed service times and for homogeneous servers with hypo-exponentially and hyper-exponentially distributed service times. This article also presents a closed-form equation for the optimal replication degree for the case of homogeneous servers with hypo-exponentially distributed service times.
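
To make the replication mechanism concrete, the following is a minimal sketch of an event-driven simulator for the system described above. It is not the authors' simulator. It assumes Poisson arrivals, a dispatcher that picks the $d$ servers uniformly at random, first-come-first-served queues, and independent exponentially distributed service times for each copy (the homogeneous exponential case). The function name `simulate_replication` and all of its parameters are hypothetical.

```python
import heapq
import random
from collections import deque


def simulate_replication(n, d, arrival_rate, service_rate, num_jobs, seed=0):
    """Estimate the average response time when each job is copied to d of n
    servers and all remaining copies are cancelled once the first one finishes.

    Assumptions (not from the article): Poisson arrivals, uniformly random
    choice of the d servers, FCFS queues, i.i.d. exponential copy service times.
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(n)]  # job ids per server; head is in service
    epoch = [0] * n        # per-server counter used to discard stale completions
    arrival_time = {}      # job id -> arrival time
    placed = {}            # job id -> servers holding a copy of the job
    events = []            # heap of (time, seq, kind, server, epoch)
    seq = 0                # unique tie-breaker so tuples never compare by kind

    def start_service(server, now):
        """Begin serving the head of a queue, invalidating older completions."""
        nonlocal seq
        epoch[server] += 1
        if queues[server]:
            seq += 1
            finish = now + rng.expovariate(service_rate)
            heapq.heappush(events, (finish, seq, "done", server, epoch[server]))

    seq += 1
    heapq.heappush(events, (rng.expovariate(arrival_rate), seq, "arrive", -1, 0))
    next_id = 0
    total_response = 0.0
    completed = 0

    while completed < num_jobs:
        now, _, kind, server, event_epoch = heapq.heappop(events)
        if kind == "arrive":
            job = next_id
            next_id += 1
            arrival_time[job] = now
            placed[job] = rng.sample(range(n), d)
            for s in placed[job]:
                queues[s].append(job)
                if len(queues[s]) == 1:  # server was idle
                    start_service(s, now)
            seq += 1
            next_arrival = now + rng.expovariate(arrival_rate)
            heapq.heappush(events, (next_arrival, seq, "arrive", -1, 0))
        elif event_epoch == epoch[server]:  # ignore cancelled completions
            job = queues[server][0]
            total_response += now - arrival_time.pop(job)
            completed += 1
            for s in placed.pop(job):  # remove every copy, queued or in service
                was_in_service = queues[s][0] == job
                queues[s].remove(job)
                if was_in_service:
                    start_service(s, now)

    return total_response / completed


if __name__ == "__main__":
    for degree in (1, 2, 4):
        avg = simulate_replication(4, degree, 2.0, 1.0, 20000, seed=1)
        print(f"d={degree}: average response time ~ {avg:.3f}")
```

Under these assumptions two cases have known answers that can serve as a sanity check: with $d = 1$ each server is an M/M/1 queue, so the mean response time should be close to $1/(\mu - \lambda/n)$, and with $d = n$ the whole system behaves like a single M/M/1 queue with service rate $n\mu$, giving $1/(n\mu - \lambda)$. Sweeping such a simulator over many parameter values would produce the kind of dataset the abstract refers to; the fitting step that yields closed-form expressions is not sketched here.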

Updated: 2020-10-01