GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices
Computing (IF 3.7), Pub Date: 2020-06-24, DOI: 10.1007/s00607-020-00827-4
Utpal Kiran, Sachin Singh Gautam, Deepak Sharma

Matrix-free solvers for the finite element method (FEM) avoid assembling the elemental matrices into a global matrix and replace the sparse matrix-vector multiplication required by an iterative solution method with element-level dense matrix-vector products. In this paper, a novel matrix-free strategy for FEM is proposed that computes the element-level matrix-vector product using only the symmetric part of the elemental matrices. The proposed strategy is developed to take advantage of the massive parallelism of the Graphics Processing Unit (GPU). A unique data structure is also introduced that ensures localized and coalesced memory access suitable for a GPU while storing only the symmetric part of the elemental matrices. In addition, the proposed strategy emphasizes efficient use of the register cache, uniform workload distribution, reduced thread synchronization, and sufficient granularity to make the best use of GPU resources. The performance of the proposed strategy is evaluated by solving elasticity and heat conduction problems using a 4-noded quadrilateral element with two degrees of freedom (DOFs) per node and one DOF per node, respectively. The performance is compared with matrix-free GPU solver strategies from the literature. A maximum speedup of 4.9× is obtained for the elasticity problem and a maximum speedup of 3.2× for the heat conduction problem. Further, the proposed strategy uses the least GPU memory of the strategies compared.
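The core idea in the abstract, an element-level matrix-vector product that reads only the symmetric part of each elemental matrix, can be illustrated with a minimal serial sketch. This is not the authors' GPU kernel or their data structure. The packed row-major upper-triangle layout and the function names `pack_upper`, `sym_matvec`, and `matrix_free_matvec` are assumptions made for illustration only.

```python
# Serial illustration only: the packed layout and all names here are
# hypothetical, not taken from the paper.

def pack_upper(K):
    """Store only the upper triangle (diagonal included) of a symmetric
    matrix, row by row, as a flat list of n*(n+1)/2 entries."""
    n = len(K)
    return [K[i][j] for i in range(n) for j in range(i, n)]


def sym_matvec(packed, x):
    """Compute y = K x from the packed upper triangle of symmetric K.
    Each off-diagonal entry is read once and used for both (i, j) and (j, i)."""
    n = len(x)
    y = [0.0] * n
    k = 0
    for i in range(n):
        y[i] += packed[k] * x[i]  # diagonal entry K[i][i]
        k += 1
        for j in range(i + 1, n):
            a = packed[k]         # K[i][j] == K[j][i]
            y[i] += a * x[j]
            y[j] += a * x[i]
            k += 1
    return y


def matrix_free_matvec(packed_elems, connectivity, x, ndof):
    """Global y = K x without assembling K: gather the element DOFs,
    apply the element product, and scatter-add the result."""
    y = [0.0] * ndof
    for packed, dofs in zip(packed_elems, connectivity):
        xe = [x[d] for d in dofs]      # gather
        ye = sym_matvec(packed, xe)    # element-level product
        for d, v in zip(dofs, ye):     # scatter-add
            y[d] += v
    return y


# Toy mesh: two 2-DOF elements sharing DOF 1 (not the quadrilateral
# elements used in the paper).
Ke = [[1.0, -1.0], [-1.0, 1.0]]
packed_elems = [pack_upper(Ke), pack_upper(Ke)]
connectivity = [(0, 1), (1, 2)]
x = [1.0, 2.0, 4.0]
y = matrix_free_matvec(packed_elems, connectivity, x, ndof=3)
```

For a symmetric n × n elemental matrix, this layout stores n(n+1)/2 entries instead of n². That is 36 of 64 entries for the 8 × 8 elasticity element and 10 of 16 for the 4 × 4 heat conduction element. On a GPU the scatter-add step would also have to handle concurrent updates to shared DOFs, which this serial sketch does not address.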

Updated: 2020-06-24