Semi-Lagrangian Vlasov simulation on GPUs
Computer Physics Communications (IF 6.3), Pub Date: 2020-09-01, DOI: 10.1016/j.cpc.2020.107351
Lukas Einkemmer

Abstract: In this paper, our goal is to solve the Vlasov equation efficiently on GPUs. A semi-Lagrangian discontinuous Galerkin scheme is used for the discretization. Such kinetic computations are extremely expensive because the phase space is high-dimensional. The SLDG code, which is publicly available under the MIT license, abstracts the number of dimensions and uses a shared codebase for both GPU- and CPU-based simulations. We investigate the performance of the implementation on a range of both Tesla (V100, Titan V, K80) and consumer (GTX 1080 Ti) GPUs. Our implementation typically achieves approximately 470 GB/s on a single GPU and 1600 GB/s on four V100 GPUs connected via NVLink. This corresponds to a speedup of about a factor of ten (comparing a single GPU with a dual-socket Intel Xeon Gold node) and approximately a factor of 35 (comparing a single node with and without GPUs). In addition, we investigate the effect of single-precision computation on the performance of the SLDG code and demonstrate that a template-based, dimension-independent implementation can achieve good performance regardless of the dimensionality of the problem.
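To illustrate the semi-Lagrangian idea the abstract refers to, here is a minimal sketch for the 1D constant-velocity advection equation f_t + v f_x = 0 on a periodic grid. It is not the SLDG scheme: it uses plain linear interpolation at the departure point instead of a discontinuous Galerkin projection, and the function name `semi_lagrangian_step` and its parameters are hypothetical, chosen only for this example.

```python
import math


def semi_lagrangian_step(f, velocity, dt, dx):
    """Advance f_t + v f_x = 0 by one time step on a periodic grid.

    Assumption for this sketch: linear interpolation between the two
    grid points surrounding the departure point (the SLDG code uses a
    discontinuous Galerkin discretization instead).
    """
    n = len(f)
    shift = velocity * dt / dx  # displacement measured in grid cells
    result = [0.0] * n
    for i in range(n):
        pos = i - shift             # departure point in index space
        left = math.floor(pos)      # grid point to the left of it
        w = pos - left              # interpolation weight in [0, 1)
        # Python's % returns a non-negative index, giving periodic wrap-around.
        result[i] = (1.0 - w) * f[left % n] + w * f[(left + 1) % n]
    return result


# A shift of half a cell averages each value with its left neighbour.
profile = [0.0, 2.0, 0.0, 0.0]
advected = semi_lagrangian_step(profile, velocity=1.0, dt=0.5, dx=1.0)
```

Because values are traced back along characteristics rather than propagated cell by cell, the step remains usable when the displacement exceeds one grid cell, which is the usual motivation for semi-Lagrangian methods.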

Updated: 2020-09-01