On the design of two-stage multiprojection methods for distributed memory systems
The Journal of Supercomputing (IF 3.3), Pub Date: 2020-02-18, DOI: 10.1007/s11227-020-03201-5
B. E. Moutafis, G. A. Gravvanis, C. K. Filelis-Papadopoulos

Solving large sparse linear systems efficiently on supercomputing infrastructures is a time-consuming component of a wide variety of simulation processes. An effective parallel solver should meet the required specifications concerning both convergence behavior and scalability. Herein, a class of two-stage algebraic domain decomposition preconditioning schemes based on the upper Schur complement method is proposed, in order to appropriately exploit distributed memory systems with multicore processors. The design of the method focuses on homogeneous hybrid parallel systems, i.e., distributed and shared memory systems. However, the proposed method can also be applied to heterogeneous systems, such as cloud infrastructures or hybrid parallel systems with accelerators, by modifying the workload distribution algorithm and taking into account the different network latencies and bandwidths. The first stage of the proposed schemes assigns the subdomains among the workstations of the distributed system, whereas the second stage further redistributes the subdomains to each core of a processor. The proposed method utilizes multiprojection techniques, based on semi-aggregated subdomains, leading to improved convergence behavior as the number of subdomains increases. Moreover, a subspace compression technique is used to improve the performance of the preprocessing phase and reduce the memory requirements of the proposed scheme. The preconditioning schemes were combined with a parallel Krylov subspace method, namely the parallel preconditioned GMRES(m) method. The convergence behavior, performance and scalability of the proposed preconditioning schemes are examined and compared to existing state-of-the-art methods through several numerical experiments on supercomputing infrastructures.
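The two-stage distribution described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' workload distribution algorithm, which the abstract does not specify. It assumes a homogeneous system, contiguous subdomain numbering, and an even split at both stages. The function names `split_evenly` and `two_stage_assignment` are hypothetical, introduced only for illustration.

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for p in range(parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if p < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def two_stage_assignment(num_subdomains, num_nodes, cores_per_node):
    """Return {(node, core): [subdomain ids]} for a homogeneous system (assumed)."""
    subdomains = list(range(num_subdomains))
    assignment = {}
    # Stage 1: assign subdomains among the nodes of the distributed system.
    for node, node_chunk in enumerate(split_evenly(subdomains, num_nodes)):
        # Stage 2: redistribute each node's subdomains among its cores.
        for core, core_chunk in enumerate(split_evenly(node_chunk, cores_per_node)):
            assignment[(node, core)] = core_chunk
    return assignment


if __name__ == "__main__":
    plan = two_stage_assignment(num_subdomains=10, num_nodes=2, cores_per_node=2)
    for (node, core), ids in sorted(plan.items()):
        print(f"node {node}, core {core}: subdomains {ids}")
```

For a heterogeneous system, the even split in `split_evenly` would presumably be replaced by a weighted split that accounts for differing compute capacities, network latencies and bandwidths, as the abstract suggests.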

Updated: 2020-02-18