Jobs Runtime Forecast for JSCC RAS Supercomputers Using Machine Learning Methods
Lobachevskii Journal of Mathematics, Pub Date: 2021-02-04, DOI: 10.1134/s1995080220120343
G. I. Savin, B. M. Shabanov, D. S. Nikolaev, A. V. Baranov, P. N. Telegin

Abstract

This paper is devoted to machine learning methods and algorithms for predicting the execution time of supercomputer jobs. Supercomputer statistics show that the actual runtime of most jobs diverges substantially from the time requested by the user. This reduces scheduling efficiency, since an inaccurate estimate of job execution time leads to a suboptimal job schedule. The paper considers a job classification based on the difference between a job's actual and requested execution time. The forecast is made from the statistics of the supercomputer's multiuser job management system by assigning a submitted job to one of the classes. Statistics from the MVS-100K and MVS-10P supercomputers at the Joint Supercomputer Center of the Russian Academy of Sciences (JSCC RAS) were used. Job flow features were ranked by importance based on the results of the statistical analysis, and the cross-correlation of the most important features was determined. Probability estimates of correct prediction were obtained for selected well-known machine learning algorithms: logistic regression, decision trees, k-nearest neighbors, linear discriminant analysis, support vector machine, random forest, gradient boosting, and a feedforward neural network. The best values were obtained using the random forest method.
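
To make the classification idea concrete, the following is a minimal sketch of how a finished job could be assigned to a class according to how its actual runtime compares with the requested one. The abstract does not state the class definitions, so the ratio-based thresholds in `BOUNDARIES`, the function name `runtime_class`, and the sample jobs are all illustrative assumptions rather than the authors' method.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical class boundaries on the ratio actual / requested runtime.
# These thresholds are illustrative assumptions, not the paper's classes.
BOUNDARIES = (0.1, 0.5, 0.9)


def runtime_class(actual_s: float, requested_s: float) -> int:
    """Return the index of the class a finished job falls into.

    Class 0 holds jobs that used under 10% of the requested time,
    class 3 holds jobs that used 90% or more of it.
    """
    if requested_s <= 0:
        raise ValueError("requested runtime must be positive")
    ratio = actual_s / requested_s
    return bisect_right(BOUNDARIES, ratio)


# Made-up (actual, requested) runtimes in seconds, for illustration only.
jobs = [(30, 3600), (1200, 3600), (3500, 3600), (2000, 3600)]
labels = [runtime_class(actual, requested) for actual, requested in jobs]

# Class frequencies; labels like these could serve as the target column
# when training any of the classifiers listed in the abstract.
class_counts = Counter(labels)
```

Labels produced this way are only known after a job finishes; a forecasting model would have to predict them at submission time from features available in the job management system statistics.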


Updated: 2021-02-05