Asynchronous distributed-memory task-parallel algorithm for compressible flows on unstructured 3D Eulerian grids
Advances in Engineering Software (IF 4.8), Pub Date: 2021-07-06, DOI: 10.1016/j.advengsoft.2020.102962
J. Bakosi 1, R. Bird 1, F. Gonzalez 2, C. Junghans 1, W. Li 3, H. Luo 3, A. Pandare 1, J. Waltz 1

We discuss the implementation of a finite element method for numerically solving the Euler equations of compressible flow on top of an asynchronous runtime system (RTS). The algorithm is implemented for distributed-memory machines, uses stationary unstructured 3D meshes, and combines data- and task-parallelism on top of the Charm++ RTS. Charm++'s execution model is asynchronous by default, allowing arbitrary overlap of computation and communication. Task-parallelism allows parts of an algorithm to be scheduled independently of, or dependent on, each other. Built-in automatic load balancing enables continuous redistribution of computational load by migrating work units based on real-time CPU load measurement. The RTS also features automatic checkpointing, fault tolerance, and resilience against hardware failure, and supports power- and energy-aware computation. We demonstrate scalability up to 25×10^9 cells on O(10^4) compute cores and the benefits of automatic load balancing for irregular workloads. The full source code with documentation is available at https://quinoacomputing.org.
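As a rough illustration of load redistribution by migrating work units according to measured load, the sketch below places units on processing elements (PEs) with a generic greedy heuristic: heaviest unit first, always onto the currently least-loaded PE. This is not the Charm++ API and not necessarily the strategy used in the paper; the function names `greedy_rebalance` and `pe_loads`, the unit identifiers, and the load values are all hypothetical.

```python
import heapq


def greedy_rebalance(unit_loads, num_pes):
    """Place work units on PEs, heaviest unit first (illustrative heuristic).

    unit_loads: dict mapping a hypothetical work-unit id to its measured load.
    Returns a dict mapping each PE index to the list of unit ids placed on it.
    """
    # Min-heap of (accumulated load, PE index); the least-loaded PE is on top.
    heap = [(0.0, pe) for pe in range(num_pes)]
    heapq.heapify(heap)
    placement = {pe: [] for pe in range(num_pes)}
    # Visit units in order of decreasing measured load.
    for unit, load in sorted(unit_loads.items(), key=lambda kv: kv[1], reverse=True):
        total, pe = heapq.heappop(heap)
        placement[pe].append(unit)
        heapq.heappush(heap, (total + load, pe))
    return placement


def pe_loads(placement, unit_loads):
    """Sum the measured load of the units assigned to each PE."""
    return {pe: sum(unit_loads[u] for u in units) for pe, units in placement.items()}


if __name__ == "__main__":
    # Hypothetical measurements: one expensive unit and several cheap ones.
    measured = {"a": 8.0, "b": 1.0, "c": 1.0, "d": 3.0, "e": 3.0}
    placement = greedy_rebalance(measured, num_pes=2)
    print(placement)
    print(pe_loads(placement, measured))
```

In a real runtime the loads would come from instrumentation gathered during execution, and a practical balancer would also weigh the cost of migrating a unit against the expected gain, which this sketch ignores.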




Updated: 2021-08-21