DeGTeC: A deep graph-temporal clustering framework for data-parallel job characterization in data centers
Future Generation Computer Systems (IF 6.2), Pub Date: 2022-11-12, DOI: 10.1016/j.future.2022.11.014
Yi Liang, Kaizhong Chen, Lan Yi, Xing Su, Xiaoming Jin

A complex data-parallel job carries task dependency information in the form of a directed acyclic graph (DAG). For convenience, data-parallel jobs represented as DAGs are called DAG jobs. The prevalence of DAG jobs in modern data centers has made scheduling-oriented job characterization a major challenge. This paper proposes a deep graph-temporal clustering framework, DeGTeC, that efficiently categorizes DAG jobs by leveraging the graph and temporal information in their DAGs. The categorization result can then be used directly to characterize the resource consumption patterns of DAG jobs. DeGTeC is built mainly on two autoencoders, TaskAE and JobAE. Both contain spectral graph convolutional network (GCN) layers, temporal convolutional network (TCN) layers, and adaptive pooling layers, which help build task embeddings and job embeddings. An additional embedding sorting step incorporates sequential order information and depth-bias information for job clustering. To the best of our knowledge, DeGTeC is the first solution that characterizes the resource consumption of DAG jobs while fully leveraging the task dependencies defined in the DAG. Experimental results demonstrate that DeGTeC outperforms state-of-the-art job resource consumption characterization methods.
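To make two of the building blocks named in the abstract concrete, the following is a minimal sketch of one generic spectral GCN propagation step and an adaptive average pooling step over a small task DAG. It is not the authors' implementation: the function names `gcn_propagate` and `adaptive_avg_pool` are hypothetical, the learned weight matrices and nonlinearities are omitted, and the sketch assumes the DAG edges are symmetrized with self-loops added, as in the commonly used normalization D^-1/2 (A + I) D^-1/2.

```python
import math


def gcn_propagate(edges, features):
    """One parameter-free GCN step: D^-1/2 (A + I) D^-1/2 X.

    Assumption: DAG edges are treated as undirected and self-loops are added;
    the abstract does not state how DeGTeC builds its adjacency matrix.
    """
    n = len(features)
    dim = len(features[0])
    # Adjacency matrix with self-loops (A + I).
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    deg = [sum(row) for row in adj]
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                # Symmetric degree normalization of each edge weight.
                w = adj[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(dim):
                    out[i][k] += w * features[j][k]
    return out


def adaptive_avg_pool(rows, out_len):
    """Average-pool a variable number of rows into exactly out_len rows."""
    n = len(rows)
    dim = len(rows[0])
    pooled = []
    for b in range(out_len):
        start = (b * n) // out_len
        end = -((-(b + 1) * n) // out_len)  # ceil((b + 1) * n / out_len)
        window = rows[start:end]
        pooled.append([sum(r[k] for r in window) / len(window) for k in range(dim)])
    return pooled


# Hypothetical 3-task job: task 0 feeds tasks 1 and 2; one feature per task.
task_features = [[1.0], [2.0], [3.0]]
smoothed = gcn_propagate([(0, 1), (0, 2)], task_features)
job_vector = adaptive_avg_pool(smoothed, 2)
```

The pooling step is what lets jobs with different numbers of tasks map to embeddings of the same size, which is a prerequisite for clustering them together.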




Updated: 2022-11-12