Highly Efficient Lattice-Boltzmann Multiphase Simulations of Immiscible Fluids at High-Density Ratios on CPUs and GPUs through Code Generation
arXiv - CS - Mathematical Software. Pub Date: 2020-12-11. DOI: arxiv-2012.06144. Authors: Markus Holzer, Martin Bauer, Ulrich Rüde
A high-performance implementation of a multiphase lattice Boltzmann method
based on the conservative Allen-Cahn model, supporting high density ratios and
high Reynolds numbers, is presented. Metaprogramming techniques are used to
generate optimized code for CPUs and GPUs automatically. The coupled model is
specified in a high-level symbolic description and optimized through automatic
transformations. The memory footprint of the resulting algorithm is reduced
through the fusion of compute kernels. A roofline analysis demonstrates the
excellent efficiency of the generated code on a single GPU. The resulting
single-GPU code has been integrated into the multiphysics framework waLBerla to
run massively parallel simulations on large domains. Communication hiding and
GPUDirect-enabled MPI yield near-perfect scaling behaviour. Scaling experiments
are conducted on the Piz Daint supercomputer with up to 2048 GPUs, simulating
several hundred fully resolved bubbles. Further, the implementation is
validated in a physically relevant scenario: a three-dimensional rising air
bubble in water.
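
The roofline argument mentioned in the abstract can be illustrated with a small calculation. Lattice Boltzmann kernels are typically limited by memory bandwidth, so an upper bound on throughput, in million lattice updates per second (MLUPS), follows from dividing the available bandwidth by the bytes moved per cell update. The sketch below assumes a single distribution field with `q` populations that are each read once and written once per update; the stencil size, the precision, and the bandwidth figure are illustrative assumptions and are not values taken from the paper, whose coupled model moves more data per cell.

```python
def bytes_per_cell_update(q: int, bytes_per_value: int = 8) -> int:
    """Bytes moved per cell, assuming each of the q populations
    is read once and written once (hypothetical traffic model)."""
    return 2 * q * bytes_per_value


def bandwidth_bound_mlups(bandwidth_gb_s: float, bytes_per_cell: int) -> float:
    """Roofline limit for a memory-bound kernel, in MLUPS."""
    updates_per_second = bandwidth_gb_s * 1e9 / bytes_per_cell
    return updates_per_second / 1e6


def roofline_efficiency(measured_mlups: float,
                        bandwidth_gb_s: float,
                        bytes_per_cell: int) -> float:
    """Fraction of the bandwidth-bound limit that a measured rate reaches."""
    return measured_mlups / bandwidth_bound_mlups(bandwidth_gb_s, bytes_per_cell)


if __name__ == "__main__":
    # Illustrative numbers only: a 19-population stencil in double precision
    # and an assumed sustained bandwidth of 500 GB/s.
    traffic = bytes_per_cell_update(q=19)
    limit = bandwidth_bound_mlups(500.0, traffic)
    print(f"bytes per cell update: {traffic}")
    print(f"bandwidth-bound limit: {limit:.1f} MLUPS")
```

Under this model, fusing kernels helps because intermediate fields no longer travel through main memory, which lowers the bytes per cell update and raises the attainable limit.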
Updated: 2020-12-14