Refactoring the MPS/University of Chicago Radiative MHD (MURaM) Model for GPU/CPU Performance Portability Using OpenACC Directives
arXiv - CS - Mathematical Software Pub Date : 2021-07-16 , DOI: arxiv-2107.08145
Eric Wright, Damien Przybylski, Matthias Rempel, Cena Miller, Supreeth Suresh, Shiquan Su, Richard Loft, Sunita Chandrasekaran

The MURaM (Max Planck University of Chicago Radiative MHD) code is a solar atmosphere radiative MHD model that has been broadly applied to solar phenomena ranging from the quiet to the active Sun, including eruptive events such as flares and coronal mass ejections. The treatment of physics is sufficiently realistic to allow for the synthesis of emission from visible light to extreme UV and X-rays, which is critical for a detailed comparison with available and future multi-wavelength observations. This capability relies critically on the radiation transport solver (RTS) of MURaM, the most computationally intensive component of the code. The benefits of accelerating the RTS are manifold: a faster RTS allows for the regular use of the more expensive multi-band radiation transport needed for comparison with observations, and it paves the way for accelerating ongoing improvements to the RTS that are critical for simulations of the solar chromosphere. We present challenges and strategies for accelerating a multi-physics, multi-band MURaM using a directive-based programming model, OpenACC, in order to maintain a single source code across CPUs and GPUs. Results for a $288^3$ test problem show that MURaM with the optimized RTS routine achieves a 1.73x speedup using a single NVIDIA V100 GPU over a fully subscribed 40-core Intel Skylake CPU node; with respect to the number of simulation points (in millions) per second, a single NVIDIA V100 GPU is equivalent to 69 Skylake cores. We also measure parallel performance on up to 96 GPUs and present weak and strong scaling results.
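
To illustrate what a directive-based, single-source approach can look like, here is a minimal C++ sketch. It is not taken from MURaM: the function `axpy_field` and its arguments are hypothetical, and the loop is a generic update chosen only to show the pattern. The one `#pragma acc` line asks an OpenACC-enabled compiler to offload the loop and move the arrays; a compiler without OpenACC support ignores the pragma and runs the same loop serially on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel (not from MURaM): y[i] += a * x[i] for every grid point.
// Assumption: y holds at least as many elements as x.
void axpy_field(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(y.size() >= x.size());

    const std::size_t n = x.size();
    const double* xp = x.data();  // raw pointers so the data clauses can name array sections
    double* yp = y.data();

    // With OpenACC enabled: copy x to the device, copy y in and back out,
    // and distribute the loop iterations across the accelerator.
    // Without OpenACC: the pragma is ignored and the loop runs on the host.
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] += a * xp[i];
    }
}
```

Calling `axpy_field(2.0, x, y)` with `x = {1, 2, 3}` and `y = {10, 20, 30}` leaves `y` as `{12, 24, 36}` on either code path. In a real code, per-loop data clauses like these would usually be replaced by longer-lived data regions so that arrays stay on the GPU between kernels; that choice is an assumption about good practice, not a statement about how MURaM is structured.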

Updated: 2021-07-20