Scaled ILU smoothers for Navier–Stokes pressure projection
International Journal for Numerical Methods in Fluids (IF 1.8), Pub Date: 2023-12-28, DOI: 10.1002/fld.5254
Stephen Thomas, Arielle Carr, Paul Mullowney, Katarzyna Świrydowicz, Marcus Day

Incomplete LU (ILU) smoothers are effective in the algebraic multigrid (AMG) V-cycle for reducing high-frequency components of the error. However, the requisite direct triangular solves are comparatively slow on GPUs. Previous work has demonstrated the advantages of Jacobi iteration as an alternative to direct solution of these systems. Depending on the threshold and fill-level parameters chosen, the factors can be highly nonnormal and Jacobi is unlikely to converge in a small number of iterations. We demonstrate that row scaling can reduce the departure from normality, allowing us to replace the inherently sequential solve with a rapidly converging Richardson iteration. There are several advantages beyond the lower compute time. Scaling is performed locally for a diagonal block of the global matrix because it is applied directly to the factor. Further, an ILUT Schur complement smoother maintains a constant GMRES iteration count as the number of MPI ranks increases, and thus parallel strong-scaling is improved. Our algorithms have been incorporated into hypre, and we demonstrate improved time to solution for linear systems arising in the Nalu-Wind and PeleLM pressure solvers. For large problem sizes, GMRES+AMG executes at least five times faster when using iterative triangular solves compared with direct solves on massively parallel GPUs.
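
The idea of replacing a sequential triangular solve with a Richardson iteration on a row-scaled factor can be illustrated with a small NumPy sketch. This is not the hypre implementation or the authors' exact scaling strategy: the function names (`henrici_departure`, `row_scale`, `richardson_triangular_solve`), the choice of scaling each row by its diagonal entry, and the dense test matrix are all assumptions made for illustration. The departure from normality is measured here with Henrici's Frobenius-norm formula, which for a triangular matrix reduces to the norm of its strictly triangular part.

```python
import numpy as np


def henrici_departure(T):
    """Henrici departure from normality of a triangular matrix T.

    The eigenvalues of a triangular matrix are its diagonal entries, so
    sqrt(||T||_F^2 - sum |lambda_i|^2) is the Frobenius norm of the
    strictly triangular part.
    """
    gap = np.linalg.norm(T, "fro") ** 2 - np.sum(np.abs(np.diag(T)) ** 2)
    return np.sqrt(max(gap, 0.0))


def row_scale(U):
    """Divide each row of U by its diagonal entry (assumed scaling choice).

    Returns the scaled matrix, which has a unit diagonal, and the diagonal d.
    """
    d = np.diag(U).copy()
    return U / d[:, None], d


def richardson_triangular_solve(U, b, num_iters):
    """Approximate the solution of U x = b without a sequential substitution.

    Runs the Richardson iteration x <- x + (c - U_s x) on the row-scaled
    system U_s x = c. Each sweep is one matrix-vector product, which
    parallelizes well. Because I - U_s is strictly triangular (nilpotent),
    the iteration is exact after at most n sweeps in exact arithmetic.
    """
    U_scaled, d = row_scale(U)
    c = b / d
    x = c.copy()  # initial guess: solution of the diagonal part
    for _ in range(num_iters):
        x = x + (c - U_scaled @ x)
    return x


# Hypothetical test factor: dominant diagonal, small off-diagonal entries.
rng = np.random.default_rng(0)
n = 8
U = np.triu(rng.uniform(-0.1, 0.1, (n, n)), 1) + np.diag(rng.uniform(5.0, 10.0, n))
b = rng.standard_normal(n)

x_exact = np.linalg.solve(U, b)
for k in (1, 2, 4):
    err = np.linalg.norm(richardson_triangular_solve(U, b, k) - x_exact)
    print(f"sweeps={k}  error={err:.2e}")

U_scaled, _ = row_scale(U)
print("departure before scaling:", henrici_departure(U))
print("departure after scaling: ", henrici_departure(U_scaled))
```

In this toy example the scaled factor has a much smaller strictly triangular part, so a few sweeps already give a small error. For factors whose diagonal entries are small relative to the off-diagonal entries, diagonal row scaling alone may not reduce the departure from normality, and more sweeps would be needed.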

