Parallel performance of molecular dynamics trajectory analysis
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2020-04-27, DOI: 10.1002/cpe.5789
Mahzad Khoshlessan, Ioannis Paraskevakos, Geoffrey C. Fox, Shantenu Jha, Oliver Beckstein

The performance of biomolecular molecular dynamics simulations has steadily increased on modern high‐performance computing resources but acceleration of the analysis of the output trajectories has lagged behind so that analyzing simulations is becoming a bottleneck. To close this gap, we studied the performance of trajectory analysis with message passing interface (MPI) parallelization and the Python MDAnalysis library on three different Extreme Science and Engineering Discovery Environment (XSEDE) supercomputers where trajectories were read from a Lustre parallel file system. Strong scaling performance was impeded by stragglers, MPI processes that were slower than the typical process. Stragglers were less prevalent for compute‐bound workloads, thus pointing to file reading as a bottleneck for scaling. However, a more complicated picture emerged in which both the computation and the data ingestion exhibited close to ideal strong scaling behavior whereas stragglers were primarily caused by either large MPI communication costs or long times to open the single shared trajectory file. We improved overall strong scaling performance by either subfiling (splitting the trajectory into separate files) or MPI‐IO with parallel HDF5 trajectory files. The parallel HDF5 approach resulted in near ideal strong scaling on up to 384 cores (16 nodes), thus reducing trajectory analysis times by two orders of magnitude compared with the serial approach.
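The per-frame analysis the abstract describes follows a split-apply-combine pattern over MPI ranks. The sketch below illustrates roughly how such a run could be set up with mpi4py and MDAnalysis; the file names ("topol.tpr", "traj.xtc"), the per-frame quantity (radius of gyration), and the script layout are illustrative assumptions, not the benchmark task or code used in the study.

```python
# Minimal split-apply-combine sketch with MPI and MDAnalysis
# (assumed inputs; not the paper's benchmark workload).
from mpi4py import MPI
import numpy as np
import MDAnalysis as mda

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Every rank opens the same shared trajectory; the abstract identifies
# this file-open step as one source of stragglers.
u = mda.Universe("topol.tpr", "traj.xtc")
protein = u.select_atoms("protein")

# Split: assign each rank a contiguous block of frames.
n_frames = len(u.trajectory)
block = np.array_split(np.arange(n_frames), size)[rank]

# Apply: compute a per-frame quantity over the local block.
local = np.empty((len(block), 2))
for i, frame in enumerate(block):
    ts = u.trajectory[frame]          # seek to the frame (file read)
    local[i, 0] = ts.time
    local[i, 1] = protein.radius_of_gyration()

# Combine: gather the partial results on rank 0.
results = comm.gather(local, root=0)
if rank == 0:
    results = np.vstack(results)
    print(results.shape)
```

A script like this would be launched in the usual MPI way, for example `mpiexec -n 384 python analyze.py`; the gather at the end is the MPI communication step whose cost the abstract cites as a cause of stragglers.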
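The best-performing variant in the abstract reads a parallel HDF5 trajectory through MPI-IO. A collective read of that kind might look like the following with h5py's mpio driver, which requires h5py built against a parallel HDF5 library; the file name, dataset name, and array layout are assumptions and do not reproduce the trajectory format used in the study.

```python
# Minimal sketch of a collective HDF5 read via MPI-IO
# (assumed file "traj.h5" with a "positions" dataset of shape
# (n_frames, n_atoms, 3); requires a parallel build of h5py/HDF5).
from mpi4py import MPI
import numpy as np
import h5py

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# All ranks open the single shared file collectively instead of each
# paying the cost of an independent open.
with h5py.File("traj.h5", "r", driver="mpio", comm=comm) as f:
    positions = f["positions"]
    frames = np.array_split(np.arange(positions.shape[0]), size)[rank]
    if len(frames):
        # Each rank reads only its block of frames through MPI-IO.
        local = positions[frames[0]:frames[-1] + 1]
    else:
        local = np.empty((0,) + positions.shape[1:])

print(rank, local.shape)
```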
