Accelerating linear solvers for Stokes problems with C++ metaprogramming, Journal of Computational Science


Journal of Computational Science (IF 3.1), Pub Date: 2020-12-29, DOI: 10.1016/j.jocs.2020.101285
Denis Demidov, Lin Mu, Bin Wang

Efficiently solving large sparse saddle point systems is essential in computational fluid mechanics. Discontinuous Galerkin finite element methods have become increasingly popular for incompressible flow problems, but their application is limited by high computational cost. We describe C++ programming techniques that may help to accelerate linear solvers for such problems. The approach is based on the policy-based design pattern and partial template specialization, and is implemented in the open source AMGCL library. Its efficiency is demonstrated by accelerating an iterative solver for a discontinuous Galerkin finite element discretization of the Stokes problem. The implementation allows the algorithmic components of the solver to be selected by adjusting template parameters, without any changes to the codebase. The system matrix can be switched to store its nonzero values as small statically sized blocks, or the solver can use mixed precision, which results in a speedup of up to 4 times and reduces the memory footprint of the algorithm by about 40%. We evaluate both monolithic and composite preconditioning strategies on 3 benchmark problems. The performance of the proposed solution is compared with the multithreaded direct solver Pardiso and a parallel iterative PETSc solver.
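The two techniques named in the abstract can be illustrated with a minimal, self-contained sketch. It is not code from AMGCL: every name in it (block2, value_ops, diagonal_preconditioner, correction_step, usage_example) is hypothetical, and the diagonal scaling is chosen only because it is short. The sketch assumes that a preconditioner is passed to a host class as a template parameter (policy-based design) and that the per-value arithmetic is picked by partial template specialization, so the same code handles scalar values and small statically sized blocks in either precision.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical small statically sized 2x2 block (row-major) used as a value type.
template <class T>
struct block2 {
    std::array<T, 4> a;
};

// Primary template: operations on scalar values.
template <class V>
struct value_ops {
    typedef V rhs_type; // a scalar value acts on a scalar unknown
    static V inverse(V v) { return V(1) / v; }
    static rhs_type apply(V v, rhs_type x) { return v * x; }
};

// Partial specialization: the same interface for 2x2 blocks of any scalar T.
template <class T>
struct value_ops< block2<T> > {
    typedef std::array<T, 2> rhs_type; // a block acts on a pair of unknowns
    static block2<T> inverse(const block2<T> &m) {
        const T det = m.a[0] * m.a[3] - m.a[1] * m.a[2];
        block2<T> r;
        r.a[0] =  m.a[3] / det;  r.a[1] = -m.a[1] / det;
        r.a[2] = -m.a[2] / det;  r.a[3] =  m.a[0] / det;
        return r;
    }
    static rhs_type apply(const block2<T> &m, const rhs_type &x) {
        rhs_type y;
        y[0] = m.a[0] * x[0] + m.a[1] * x[1];
        y[1] = m.a[2] * x[0] + m.a[3] * x[1];
        return y;
    }
};

// Preconditioner policy: (block-)diagonal scaling. The value type V fixes both
// the storage precision and the block size at compile time.
template <class V>
class diagonal_preconditioner {
public:
    typedef typename value_ops<V>::rhs_type rhs_type;

    explicit diagonal_preconditioner(const std::vector<V> &diag) {
        for (const V &d : diag) inv_diag.push_back(value_ops<V>::inverse(d));
    }
    rhs_type apply(std::size_t i, const rhs_type &r) const {
        return value_ops<V>::apply(inv_diag[i], r);
    }
private:
    std::vector<V> inv_diag;
};

// Host class: the preconditioner is a template parameter, so replacing it
// requires no change to this code.
template <class Precond>
class correction_step {
public:
    typedef typename Precond::rhs_type rhs_type;
    explicit correction_step(const Precond &p) : P(p) {}
    // Preconditioned correction for unknown (or block of unknowns) i.
    rhs_type operator()(std::size_t i, const rhs_type &r) const { return P.apply(i, r); }
private:
    Precond P;
};

inline void usage_example() {
    // Scalar values stored in single precision.
    std::vector<float> ds = {2.0f, 4.0f};
    diagonal_preconditioner<float> Ps(ds);
    correction_step< diagonal_preconditioner<float> > scalar_step(Ps);
    assert(scalar_step(0, 8.0f) == 4.0f);

    // The same host code with 2x2 blocks stored in double precision.
    std::array<double, 4> entries = {{2.0, 1.0, 1.0, 1.0}};
    block2<double> d = {entries};
    std::vector< block2<double> > db(1, d);
    diagonal_preconditioner< block2<double> > Pb(db);
    correction_step< diagonal_preconditioner< block2<double> > > block_step(Pb);
    std::array<double, 2> r = {{3.0, 2.0}};
    std::array<double, 2> x = block_step(0, r);
    assert(x[0] == 1.0 && x[1] == 1.0);
}
```

In this sketch, moving from scalar single precision values to double precision 2x2 blocks changes only the template argument. The host class and the preconditioner policy stay untouched, which is the kind of component selection the abstract describes.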


Updated: 2021-01-04
