Scalable and accurate multi-GPU based image reconstruction of large-scale ptychography data
arXiv - CS - Distributed, Parallel, and Cluster Computing
Pub Date: 2021-06-14, DOI: arxiv-2106.07575
Xiaodong Yu, Viktor Nikitin, Daniel J. Ching, Selin Aslan, Doga Gursoy, Tekin Bicer
Advances in synchrotron light sources, together with the development of
focusing optics and detectors, allow nanoscale ptychographic imaging of
materials and biological specimens. The corresponding experiments, however,
can yield terabyte-scale volumes of data that impose a heavy burden on the
computing platform. Although Graphics Processing Units (GPUs) provide high
performance for such large-scale ptychography datasets, a single GPU is
typically insufficient for analysis and reconstruction. Several existing works
have considered leveraging multiple GPUs to accelerate ptychographic
reconstruction. However, they rely solely on the Message Passing Interface
(MPI) to handle communication between GPUs. This approach is inefficient for
configurations with multiple GPUs in a single node, especially when
processing a single large projection, because it provides no optimizations for
heterogeneous GPU interconnects that mix low-speed links, e.g., PCIe, with
high-speed links, e.g., NVLink. In this paper, we provide a multi-GPU
implementation that can effectively solve large-scale ptychographic
reconstruction problems with optimized performance on intra-node multi-GPU
systems. We focus on the conventional maximum-likelihood reconstruction
problem, using the conjugate-gradient (CG) method for the solution, and
propose a novel hybrid parallelization model to address the performance
bottlenecks in the CG solver. Accordingly, we develop a tool called PtyGer
(Ptychographic GPU(multiple)-based reconstruction) that implements our hybrid
parallelization model design. A comprehensive evaluation verifies that PtyGer
fully preserves the original algorithm's accuracy while achieving outstanding
intra-node GPU scalability.
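
To make the data-parallel CG idea concrete, here is a minimal pure-Python sketch. It is not PtyGer's hybrid parallelization model, which the abstract does not spell out. It assumes a toy linear least-squares objective in place of the ptychographic maximum-likelihood objective, and it simulates the inter-GPU all-reduce with an ordinary in-process sum. All function names (`partial_gradient`, `all_reduce_sum`, `cg_solve`) are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def partial_gradient(rows, rhs, x):
    """Gradient of 0.5 * sum_i (a_i . x - b_i)^2 over one data partition."""
    grad = [0.0] * len(x)
    for a, b in zip(rows, rhs):
        residual = dot(a, x) - b
        for j, aj in enumerate(a):
            grad[j] += residual * aj
    return grad


def all_reduce_sum(vectors):
    """Stand-in for an inter-GPU all-reduce: element-wise sum of partial results."""
    return [sum(vals) for vals in zip(*vectors)]


def cg_solve(partitions, n, iterations):
    """Conjugate gradient where each partition plays the role of one device.

    Assumption: the objective is quadratic, so the step size has a closed form.
    """
    x = [0.0] * n
    grad = all_reduce_sum([partial_gradient(r, b, x) for r, b in partitions])
    direction = [-g for g in grad]
    for _ in range(iterations):
        gg = dot(grad, grad)
        if gg < 1e-24:  # gradient has vanished; stop early
            break
        # Exact line search for a quadratic: curvature is sum_i (a_i . d)^2.
        curvature = sum(dot(a, direction) ** 2
                        for rows, _ in partitions for a in rows)
        alpha = -dot(grad, direction) / curvature
        x = [xi + alpha * di for xi, di in zip(x, direction)]
        new_grad = all_reduce_sum(
            [partial_gradient(r, b, x) for r, b in partitions])
        beta = dot(new_grad, new_grad) / gg  # Fletcher-Reeves update
        direction = [-g + beta * d for g, d in zip(new_grad, direction)]
        grad = new_grad
    return x


# Two partitions of a small consistent system whose solution is x = [2, -1].
partitions = [
    ([[1.0, 0.0], [0.0, 1.0]], [2.0, -1.0]),
    ([[1.0, 1.0], [1.0, -1.0]], [1.0, 3.0]),
]
solution = cg_solve(partitions, n=2, iterations=10)
```

In a real multi-GPU setting, each partition's gradient would be computed on its own device and the sum would be a collective operation over the interconnect. The abstract identifies exactly that communication as the part that benefits from interconnect-aware optimization.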
Updated: 2021-06-15