I-Scheduler: Iterative scheduling for distributed stream processing systems
Future Generation Computer Systems (IF 6.2), Pub Date: 2020-11-23, DOI: 10.1016/j.future.2020.11.011
Leila Eskandari, Jason Mair, Zhiyi Huang, David Eyers

Task allocation in Data Stream Processing Systems (DSPSs) has a significant impact on performance metrics such as data processing latency and system throughput. An application processed by DSPSs can be represented as a Directed Acyclic Graph (DAG), where each vertex represents a task and the edges show the dataflow between the tasks. Task allocation can be defined as the assignment of the vertices in the DAG to the physical compute nodes such that the data movement between the nodes is minimised. Finding an optimal task placement for DSPSs is NP-hard. Thus, approximate scheduling approaches are required to improve the performance of DSPSs. In this paper, we propose a heuristic scheduling algorithm which reliably and efficiently finds highly communicating tasks by exploiting graph partitioning algorithms and a mathematical optimisation software package. We evaluate the communication cost of our method using three micro-benchmarks, showing that we can achieve results that are close to optimal. We further compare our scheduler with two popular existing schedulers, R-Storm and Aniello et al.’s ‘Online scheduler’, using two real-world applications. Our experimental results show that our proposed scheduler outperforms R-Storm, increasing throughput by up to 30%, and improves on the Online scheduler by 20%–86% as a result of finding a more efficient schedule.
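
To make the objective in the abstract concrete, the following is a minimal sketch of the task allocation problem itself, not of the I-Scheduler algorithm. It scores a placement by the traffic that crosses compute-node boundaries and finds the cheapest placement by exhaustive search, which is feasible only for tiny graphs since the problem is NP-hard. The task names, the edge rates, the two-node cluster, and the per-node capacity of three tasks are all hypothetical values chosen for illustration; the capacity limit is an assumption added so that placing every task on one node is not a trivial answer.

```python
from itertools import product

# Hypothetical toy DAG: (source task, destination task) -> tuples per second.
EDGES = {
    ("spout", "parse"): 100,
    ("parse", "count"): 80,
    ("parse", "filter"): 20,
    ("count", "sink"): 10,
    ("filter", "sink"): 5,
}
TASKS = sorted({task for edge in EDGES for task in edge})


def communication_cost(placement, edges=EDGES):
    """Sum the traffic on edges whose endpoints sit on different nodes."""
    return sum(rate for (src, dst), rate in edges.items()
               if placement[src] != placement[dst])


def best_placement(tasks=TASKS, num_nodes=2, capacity=3, edges=EDGES):
    """Try every placement that respects the (assumed) per-node capacity."""
    best, best_cost = None, float("inf")
    for assignment in product(range(num_nodes), repeat=len(tasks)):
        # Skip placements that overload any node.
        if any(assignment.count(node) > capacity for node in range(num_nodes)):
            continue
        placement = dict(zip(tasks, assignment))
        cost = communication_cost(placement, edges)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


if __name__ == "__main__":
    placement, cost = best_placement()
    print(placement, cost)
```

In this toy instance the cheapest placement keeps the heavily communicating tasks (spout, parse, count) on one node and puts filter and sink on the other, so only the two light edges cross the network. The search visits every assignment of tasks to nodes, which grows exponentially with the number of tasks; that growth is why heuristic approaches such as the one described in the abstract are needed at realistic scale.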




Updated: 2020-12-11