Dynamic load balancing with enhanced shared-memory parallelism for particle-in-cell codes
Computer Physics Communications (IF 7.2), Pub Date: 2021-02-01, DOI: 10.1016/j.cpc.2020.107633
Kyle G. Miller, Roman P. Lee, Adam Tableman, Anton Helm, Ricardo A. Fonseca, Viktor K. Decyk, Warren B. Mori

Abstract: Furthering our understanding of many of today’s interesting problems in plasma physics – including plasma-based acceleration and magnetic reconnection with pair production due to quantum electrodynamic effects – requires large-scale kinetic simulations using particle-in-cell (PIC) codes. However, these simulations are extremely demanding, requiring that contemporary PIC codes be designed to efficiently use a new fleet of exascale computing architectures. To this end, the key issue of parallel load balance across computational nodes must be addressed. We discuss the implementation of dynamic load balancing by dividing the simulation space into many small, self-contained regions or “tiles,” along with shared-memory (e.g., OpenMP) parallelism both over many tiles and within single tiles. The load balancing algorithm can be used with three different topologies, including two space-filling curves. We tested this implementation in the code Osiris and show low overhead and improved scalability with OpenMP thread number on simulations with both uniform load and severe load imbalance. Compared to other load-balancing techniques, our algorithm gives order-of-magnitude improvement in parallel scalability for simulations with severe load imbalance issues.

Updated: 2021-02-01