TOP-Storm: A topology-based resource-aware scheduler for Stream Processing Engine
Cluster Computing (IF 4.4) Pub Date: 2020-05-06, DOI: 10.1007/s10586-020-03117-y
Asif Muhammad, Muhammad Aleem, Muhammad Arshad Islam

Like other emerging fields, Stream Processing Engines (SPEs) pose several challenges to researchers, e.g., resource awareness, dynamic configuration, heterogeneous clusters, load balancing, and topology awareness. All of these aspects play a major role in the job scheduling process. Currently, SPEs ignore the topology's structure while scheduling. As a result, frequently communicating tasks may end up on different computing nodes, which makes it difficult to achieve maximum throughput. In this paper, TOP-Storm, a scheduler based on the topology's DAG (Directed Acyclic Graph), is proposed for Apache Storm (a popular open-source SPE); it optimizes resource usage for heterogeneous clusters. The aim is to improve efficiency using resource-aware task assignments that result in enhanced throughput and optimized resource utilization. TOP-Storm is divided into two phases. In the first phase, executors are logically grouped with the help of the DAG to minimize inter-group communication. In the second phase, these groups are assigned to physical nodes, starting from the most powerful node. Results are generated using two benchmark topologies and compared with two state-of-the-art scheduling algorithms. Experimental results show up to 39% and 11% improvement in throughput compared to the default Apache Storm scheduler and R-Storm, respectively.
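
The two phases described in the abstract can be illustrated with a small sketch. This is not the TOP-Storm algorithm itself, which the abstract does not specify in detail; it is a simplified greedy stand-in written under stated assumptions. The function names (`group_executors`, `assign_groups`, `inter_node_traffic`), the per-edge traffic weights, the `max_group_size` limit, and the `power`/`slots` node fields are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_executors(executors, edges, max_group_size):
    """Phase 1 (sketch): merge executors joined by the heaviest DAG edges first.

    `edges` is a list of (source, target, traffic) tuples. The traffic weights
    and the group-size cap are assumptions, not values from the paper.
    """
    parent = {e: e for e in executors}
    size = {e: 1 for e in executors}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _traffic in sorted(edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv and size[ru] + size[rv] <= max_group_size:
            parent[rv] = ru
            size[ru] += size[rv]

    groups = defaultdict(list)
    for e in executors:
        groups[find(e)].append(e)
    return list(groups.values())


def assign_groups(groups, nodes):
    """Phase 2 (sketch): place groups on nodes, trying the most powerful first.

    Each node is a dict with hypothetical keys "name", "power", and "slots".
    """
    ranked = sorted(nodes, key=lambda n: n["power"], reverse=True)
    free = {n["name"]: n["slots"] for n in ranked}
    placement = {}
    for group in sorted(groups, key=len, reverse=True):
        for node in ranked:
            if free[node["name"]] >= len(group):
                free[node["name"]] -= len(group)
                for executor in group:
                    placement[executor] = node["name"]
                break
        else:
            raise ValueError("no node has enough free slots for a group")
    return placement


def inter_node_traffic(edges, placement):
    """Total traffic on edges whose endpoints sit on different nodes."""
    return sum(t for u, v, t in edges if placement[u] != placement[v])
```

With a four-executor chain such as `spout -> split -> count -> sink` and a group size of two, the sketch keeps the two heaviest-communicating executors on the same node, so only the traffic between the resulting groups crosses the network. A real scheduler would also have to account for CPU and memory demands per executor, which this sketch reduces to a single slot count.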




Updated: 2020-05-06