Integrating clustering and regression for workload estimation in the cloud
Concurrency and Computation: Practice and Experience (IF 2), Pub Date: 2020-07-20, DOI: 10.1002/cpe.5931
Yongjia Yu 1, Vasu Jindal 2, I-Ling Yen 2, Farokh Bastani 2, Jie Xu 3, Peter Garraghan 4

Workload prediction has been widely researched in the literature. However, existing techniques are per-job based and are useful mainly for service-like tasks whose workloads exhibit seasonality and trend. Cloud jobs, by contrast, have many different workload patterns, and some do not exhibit recurring patterns at all. We consider job-pool-based workload estimation, which analyzes the workload characteristics of existing tasks to estimate the workload of currently running tasks. We first cluster existing tasks based on their workloads. For a new task J, we collect the initial workload of J, determine which cluster J may belong to, and then use that cluster's characteristics to estimate J's workload. The algorithm is evaluated experimentally on the Google dataset, and its effectiveness is confirmed. However, the workload patterns of some tasks do have seasonality and trend, and for these tasks conventional per-job-based regression methods may yield better predictions. Also, in some cases new tasks may not follow the workload patterns of existing tasks in the pool. We therefore develop an integrated scheme that combines clustering and regression and utilizes the best of both for workload prediction. Experimental study shows that the combined approach can further improve the accuracy of workload prediction.
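
The abstract describes the pipeline only at a high level, so the following Python sketch is an illustration under stated assumptions rather than the authors' algorithm. It assumes fixed-length workload series, plain k-means with Euclidean distance for the clustering step, a least-squares line as the per-job regression, and a distance threshold (`max_dist`, a hypothetical parameter) to decide when a new task fits no cluster in the pool. None of these choices are given in the abstract, and all function names are made up for the example.

```python
import math


def distance(a, b):
    """Euclidean distance over the overlapping prefix of two series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(series, k, iters=20):
    """Plain k-means over equal-length workload series (assumes k <= len(series))."""
    # Deterministic init (an assumption): k evenly spaced series as centroids.
    step = len(series) // k
    centroids = [list(series[i * step]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in series:
            nearest = min(range(k), key=lambda c: distance(s, centroids[c]))
            groups[nearest].append(s)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def linear_forecast(prefix, horizon):
    """Least-squares line through the prefix (needs >= 2 points), extrapolated."""
    n = len(prefix)
    mean_t = (n - 1) / 2
    mean_y = sum(prefix) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(prefix)) / var_t
    return [mean_y + slope * (t - mean_t) for t in range(n, n + horizon)]


def estimate(prefix, centroids, horizon, max_dist):
    """Return (method, forecast) for a new task given its initial workload."""
    n = len(prefix)
    # Match the task's initial workload against the start of each centroid.
    best = min(centroids, key=lambda c: distance(prefix, c))
    if distance(prefix, best) <= max_dist and len(best) >= n + horizon:
        # The task resembles a known cluster: reuse that cluster's continuation.
        return "cluster", best[n:n + horizon]
    # Otherwise fall back to per-job regression on the task's own history.
    return "regression", linear_forecast(prefix, horizon)


if __name__ == "__main__":
    # Toy pool: two flat tasks and two steadily growing tasks (made-up numbers).
    pool = [
        [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        [1.2, 1.0, 0.8, 1.0, 1.2, 1.0],
        [5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
        [5.0, 6.2, 7.0, 7.8, 9.0, 10.0],
    ]
    centroids = kmeans(pool, k=2)
    print(estimate([5.0, 6.0, 7.0], centroids, horizon=3, max_dist=1.0))
    print(estimate([20.0, 22.0, 24.0], centroids, horizon=2, max_dist=1.0))
```

In this toy run the first task matches the growing cluster and inherits its continuation, while the second task is far from every centroid and is handled by regression. A real integration could instead pick the method by validation error on the observed prefix; the threshold rule here is just the simplest stand-in.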

Updated: 2020-07-20