Stargazer: Toward efficient data analytics scheduling via task completion time inference
Computers & Electrical Engineering ( IF 4.3 ) Pub Date : 2021-04-08 , DOI: 10.1016/j.compeleceng.2021.107092
Haizhou Du , Keke Zhang , Qiao Xiang

The fundamental challenge of data analytics scheduling is the heterogeneity of both data analytics jobs and resources. Although many scheduling solutions have been developed to improve the efficiency of data analytics frameworks (e.g., Spark), they either (1) focus on scheduling a single type of resource, without considering the coordination between different resources; or (2) schedule multiple resources using only limited information about analytics jobs, without considering the heterogeneity of resources. This paper presents Stargazer, a novel, efficient system that handles diverse data analytics jobs on heterogeneous clusters by inferring the completion times of their decomposed tasks. Specifically, Stargazer adopts a deep learning model that takes into consideration multiple key factors of diverse data analytics jobs and heterogeneous resources to accurately infer the completion time of different tasks. A prototype of Stargazer is fully implemented in the Spark framework. Extensive experiments show that Stargazer can reduce the average job completion time by 21% and improve average performance by 20%, while incurring little overhead.
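
To make the idea of scheduling by inferred task completion time concrete, the following is a minimal sketch and not Stargazer's actual algorithm. It assumes a hypothetical predictor, `predict_completion_time`, that stands in for the paper's deep learning model with a simple analytic estimate, and it places each task greedily on the executor with the earliest predicted finish time. The field names (`cpu_work`, `input_mb`, `cpu_speed`, `io_mbps`) and the greedy policy are illustrative assumptions; the abstract does not specify them.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical descriptions of tasks and executors (field names are assumptions).
Task = Dict[str, float]      # e.g. {"cpu_work": 8.0, "input_mb": 400.0}
Executor = Dict[str, float]  # e.g. {"cpu_speed": 4.0, "io_mbps": 400.0}


def predict_completion_time(task: Task, executor: Executor) -> float:
    """Stand-in for a learned model: compute time plus input-read time, in seconds."""
    return task["cpu_work"] / executor["cpu_speed"] + task["input_mb"] / executor["io_mbps"]


def schedule(
    tasks: List[Task],
    executors: List[Executor],
    predict: Callable[[Task, Executor], float] = predict_completion_time,
) -> Tuple[List[Tuple[int, int, float]], float]:
    """Greedily assign each task to the executor with the earliest predicted finish.

    Returns (placement, makespan), where placement holds
    (task_index, executor_index, predicted_finish_time) tuples.
    """
    free_at = [0.0] * len(executors)  # time at which each executor becomes idle
    placement: List[Tuple[int, int, float]] = []

    # Consider tasks with the longest best-case predicted time first.
    order = sorted(
        range(len(tasks)),
        key=lambda i: -min(predict(tasks[i], e) for e in executors),
    )
    for i in order:
        # Pick the executor that would finish this task soonest.
        finish, chosen = min(
            (free_at[k] + predict(tasks[i], executors[k]), k)
            for k in range(len(executors))
        )
        free_at[chosen] = finish
        placement.append((i, chosen, finish))
    return placement, max(free_at)
```

Because the predictor is passed in as a parameter, a trained model could replace the analytic estimate without changing the placement logic; the per-executor predictions are where the heterogeneity of resources enters the decision.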




Updated: 2021-04-08