Efficient OpenMP parallelization to a complex MPI parallel magnetohydrodynamics code
Journal of Parallel and Distributed Computing (IF 3.8), Pub Date: 2020-02-13, DOI: 10.1016/j.jpdc.2020.02.004
Hongyang Zhou, Gábor Tóth

The state-of-the-art finite volume/difference magnetohydrodynamics (MHD) code Block Adaptive Tree Solarwind Roe Upwind Scheme (BATS-R-US) was originally designed with pure MPI parallelization. The maximum achievable problem size was limited by the storage requirements of the block tree structure. To mitigate this limitation, we have added multi-threaded OpenMP parallelization to the previous pure MPI implementation. We opted for a coarse-grained approach, making the loops over grid blocks multi-threaded, and have succeeded in turning BATS-R-US into an efficient hybrid parallel code with modest changes to the source code while preserving performance. Good weak scaling up to hundreds of thousands of cores was achieved for both explicit and implicit time stepping schemes. This parallelization strategy greatly extended the possible simulation scale from 16,000 cores to more than 500,000 cores with 2 GB/core memory on the Blue Waters supercomputer. Our work also revealed significant performance issues with some compilers when the code is compiled with the OpenMP library, probably related to less efficient optimization of a complex multi-threaded region.
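
The coarse-grained approach named in the abstract can be illustrated with a minimal sketch. This is not code from BATS-R-US: it is written in C for illustration only, and the `Block` type, `CELLS_PER_BLOCK`, `update_block`, and `advance_all_blocks` are all hypothetical names. The per-block kernel is a placeholder, and the MPI layer is omitted. The only point shown is that the OpenMP directive sits on the loop over grid blocks, while the loops over cells inside a block stay serial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size: 8 x 8 x 8 cells per grid block. */
#define CELLS_PER_BLOCK 512

/* Hypothetical grid block holding one scalar state value per cell. */
typedef struct {
    double state[CELLS_PER_BLOCK];
} Block;

/* Placeholder per-block kernel: relax every cell toward the block mean.
 * It stands in for a real flux and source-term update and touches only
 * the data of its own block, so blocks can be processed independently. */
static void update_block(Block *b, double rate)
{
    double mean = 0.0;
    for (int i = 0; i < CELLS_PER_BLOCK; ++i) {
        mean += b->state[i];
    }
    mean /= CELLS_PER_BLOCK;

    for (int i = 0; i < CELLS_PER_BLOCK; ++i) {
        b->state[i] += rate * (mean - b->state[i]);
    }
}

/* Coarse-grained threading: the loop over blocks is multi-threaded,
 * each thread handling whole blocks. Without OpenMP support enabled
 * at compile time the pragma is ignored and the loop runs serially. */
void advance_all_blocks(Block *blocks, int n_blocks, double rate)
{
    assert(blocks != NULL || n_blocks == 0);

    #pragma omp parallel for schedule(dynamic)
    for (int ib = 0; ib < n_blocks; ++ib) {
        update_block(&blocks[ib], rate);
    }
}
```

In a hybrid run, each MPI process would call a routine like `advance_all_blocks` on the blocks it owns. With GCC or Clang the threaded version is enabled with `-fopenmp`; the `schedule(dynamic)` clause is an assumption made here for blocks of uneven cost, not a choice reported in the abstract.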




Updated: 2020-02-20