On the performance of GPU accelerated q-LSKUM based meshfree solvers in Fortran, C++, Python, and Julia
arXiv - CS - Performance Pub Date : 2021-08-16 , DOI: arxiv-2108.07031
Nischay Ram Mamidi, Kumar Prasun, Dhruv Saxena, Anil Nemili, Bharatkumar Sharma, S. M. Deshpande

This report presents a comprehensive analysis of the performance of GPU-accelerated meshfree CFD solvers for two-dimensional compressible flows in Fortran, C++, Python, and Julia. The GPU codes are developed using the CUDA programming model. The meshfree solver is based on the least squares kinetic upwind method with entropy variables (q-LSKUM). To assess the computational efficiency of the GPU solvers and to compare their relative performance, benchmark calculations are performed on seven levels of point distribution. To analyse the difference in their run-times, the computationally intensive kernel is profiled. Various performance metrics are investigated from the profiled data to determine the cause of the observed variation in run-times. To address some of the performance-related issues, various optimisation strategies are employed. The optimised GPU codes are compared with the naive codes, and conclusions are drawn from their performance.
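
For readers unfamiliar with meshfree methods, the least-squares idea named in the abstract can be illustrated with a minimal sketch: estimating the gradient of a function at a point from scattered neighbours, with no mesh connectivity. This is not the authors' solver. The function name `ls_gradient` and its arguments are hypothetical, the fit is unweighted, and everything else in q-LSKUM (kinetic upwinding, the entropy variables, the GPU kernels) is left out.

```python
def ls_gradient(p0, f0, neighbours):
    """Least-squares estimate of (df/dx, df/dy) at p0 from scattered neighbours.

    p0         -- (x, y) of the point where the gradient is wanted
    f0         -- function value at p0
    neighbours -- list of ((x, y), f) pairs in the local point cloud

    Hypothetical helper for illustration only; unweighted fit.
    """
    x0, y0 = p0
    sxx = sxy = syy = sxf = syf = 0.0
    for (x, y), f in neighbours:
        dx, dy, df = x - x0, y - y0, f - f0
        # Accumulate the entries of the 2x2 normal equations.
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        sxf += dx * df
        syf += dy * df
    det = sxx * syy - sxy * sxy
    if det == 0.0:
        raise ValueError("neighbours are collinear with p0; gradient is undetermined")
    # Solve the normal equations by Cramer's rule.
    fx = (syy * sxf - sxy * syf) / det
    fy = (sxx * syf - sxy * sxf) / det
    return fx, fy
```

For a linear function the fit is exact up to rounding, so calling it on samples of f(x, y) = 2x + 3y + 1 should return values very close to (2, 3). In a solver of this kind, a loop of this shape over every point and its neighbours is the sort of work that would be offloaded to a GPU kernel.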

Updated: 2021-08-17