Machine Learning–enabled Scalable Performance Prediction of Scientific Codes
ACM Transactions on Modeling and Computer Simulation (IF 0.7), Pub Date: 2021-04-23, DOI: 10.1145/3450264
Gopinath Chennupati, Nandakishore Santhi, Phill Romero, Stephan Eidenbenz

Hardware architectures become increasingly complex as compute capabilities grow to exascale. We present the Analytical Memory Model with Pipelines (AMMP) of the Performance Prediction Toolkit (PPT). PPT-AMMP takes high-level source code and hardware architecture parameters as input and predicts the runtime of that code on the target hardware platform defined by those parameters. PPT-AMMP transforms the code to an architecture-independent intermediate representation, then (i) analyzes the basic block structure of the code, (ii) processes architecture-independent virtual memory access patterns, which it uses to build memory reuse distance distribution models for each basic block, and (iii) runs detailed basic-block-level simulations to determine hardware pipeline usage. PPT-AMMP uses machine learning and regression techniques to build the prediction models from small instances of the input code, then integrates them into a higher-order discrete-event simulation model of PPT running on the Simian PDES engine. We validate PPT-AMMP on four standard computational physics benchmarks and present a use case of hardware parameter sensitivity analysis that identifies bottleneck hardware resources for different code inputs. We further extend PPT-AMMP to predict the performance of a scientific application code, namely the radiation transport mini-app SNAP. To this end, we analyze multivariate regression models that accurately predict the reuse profiles and the basic block counts. We validate the predicted SNAP runtimes against measured runtimes.
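
To make the notion of a memory reuse distance distribution concrete, here is a minimal Python sketch. It is not the PPT-AMMP implementation: the function names (`reuse_distances`, `reuse_profile`, `lru_hit_rate`) and the toy trace are hypothetical. The hit-rate step assumes a fully associative LRU cache, which is a simplification chosen for illustration and not necessarily the cache model used in the paper.

```python
from collections import Counter


def reuse_distances(trace):
    """Return the reuse distance of every access in `trace`.

    The reuse distance of an access is the number of distinct addresses
    touched since the previous access to the same address. First-time
    accesses get float("inf").
    """
    stack = []  # LRU stack: most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            # Everything above `addr` in the stack was touched since its last use.
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(float("inf"))
        stack.append(addr)
    return distances


def reuse_profile(trace):
    """Normalize the reuse distances into a probability distribution."""
    distances = reuse_distances(trace)
    counts = Counter(distances)
    total = len(distances)
    return {dist: count / total for dist, count in counts.items()}


def lru_hit_rate(profile, cache_lines):
    """Estimate the hit rate of a fully associative LRU cache.

    Illustrative assumption: an access hits exactly when its reuse
    distance is smaller than the number of lines the cache can hold.
    """
    return sum(p for dist, p in profile.items() if dist < cache_lines)


if __name__ == "__main__":
    # Hypothetical trace of addresses; a real trace would come from
    # instrumenting the code's memory accesses.
    trace = ["a", "b", "c", "a", "b", "b"]
    profile = reuse_profile(trace)
    print(profile)
    print(lru_hit_rate(profile, cache_lines=3))
```

The profile depends only on the access pattern, not on any cache size, so one profile can be evaluated against many cache sizes. This is the sense in which such a model is architecture-independent. The quadratic-time stack search keeps the sketch short; long traces would need a tree-based structure.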

Updated: 2021-04-23
