Prediction of high-performance computing input/output variability and its application to optimization for system configurations
Quality Engineering (IF 1.3), Pub Date: 2021-02-18, DOI: 10.1080/08982112.2020.1866203
Li Xu 1 , Thomas Lux 2 , Tyler Chang 2 , Bo Li 2 , Yili Hong 1 , Layne Watson 2 , Ali Butt 2 , Danfeng Yao 2 , Kirk Cameron 2

Abstract

Performance variability is an important measure for a reliable high-performance computing (HPC) system. Performance variability is affected by complicated interactions between numerous factors, such as CPU frequency, the number of input/output (IO) threads, and the IO scheduler. In this paper, we focus on HPC IO variability. The prediction of HPC variability is a challenging problem in the engineering of HPC systems, and there is little statistical work on this problem to date. Although there are many methods available in the computer experiment literature, the applicability of existing methods to HPC performance variability needs investigation, especially when the objective is to predict performance variability in both interpolation and extrapolation settings. A data-analytic framework is developed to model data collected from large-scale experiments. Various promising methods are used to build predictive models for the variability of HPC systems. We evaluate the performance of the methods by measuring prediction accuracy at previously unseen system configurations. We also discuss a methodology for optimizing system configurations that uses the estimated variability map. The findings from method comparisons and developed tool sets in this paper yield new insights into existing statistical methods and can be beneficial for the practice of HPC variability management. This paper has supplementary materials online.




Updated: 2021-02-18