A Parallel Ghosting Algorithm for the Flexible Distributed Mesh Database
Scientific Programming Pub Date : 2013 , DOI: 10.3233/spr-130361
Misbah Mubarak, Seegyoung Seol, Qiukai Lu, Mark S. Shephard

Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computation; when the neighbors reside on different processors, this information must be transmitted efficiently to avoid degrading parallel performance. This article presents a parallel algorithm for creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during ghost creation, thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n ghost layers, up to the point where the whole partitioned mesh is ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures on up to 32,768 cores. The algorithm also scales when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
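The idea of layered ghosting in point (3) can be illustrated with a small serial sketch. This is not the authors' implementation or the mesh database's API: the function names (`chain_adjacency`, `ghost_layers`, `senders`), the dictionary-based mesh representation, and the 1D example mesh are all assumptions made for illustration, and real message passing is replaced by a lookup of which parts own the ghosted elements.

```python
from collections import defaultdict


def chain_adjacency(n):
    """Hypothetical 1D mesh: element i is adjacent to i-1 and i+1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ghost_layers(adjacency, owner, part, num_layers):
    """Return {element: layer} for remote elements ghosted onto `part`.

    adjacency: dict, element id -> iterable of adjacent element ids
    owner:     dict, element id -> id of the part that owns it
    Layer 1 holds remote elements adjacent to locally owned ones;
    layer k holds remote elements adjacent to layer k-1 ghosts.
    """
    local = {e for e, p in owner.items() if p == part}
    ghosts = {}
    frontier = local
    for layer in range(1, num_layers + 1):
        next_frontier = set()
        for e in frontier:
            for nbr in adjacency[e]:
                if nbr not in local and nbr not in ghosts:
                    ghosts[nbr] = layer
                    next_frontier.add(nbr)
        if not next_frontier:
            break  # nothing left to ghost: the whole mesh is already covered
        frontier = next_frontier
    return ghosts


def senders(ghosts, owner):
    """Group ghosted elements by owning part, i.e. the only parts
    that would need to send data (no all-to-all exchange)."""
    by_part = defaultdict(list)
    for e in sorted(ghosts):
        by_part[owner[e]].append(e)
    return dict(by_part)


if __name__ == "__main__":
    adj = chain_adjacency(8)
    # Assumed partition: part 0 owns 0-2, part 1 owns 3-5, part 2 owns 6-7.
    own = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2}
    print(ghost_layers(adj, own, 0, 2))               # {3: 1, 4: 2}
    print(senders(ghost_layers(adj, own, 0, 4), own))  # {1: [3, 4, 5], 2: [6]}
```

With one or two layers, part 0 only needs data from part 1; part 2 becomes a sender only once the requested depth reaches its elements, which is the sense in which communication stays within a neighborhood. Requesting more layers than the mesh is deep simply ghosts every remote element.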

Updated: 2020-09-25