Algorithm for replica redistribution in an implementation of the population annealing method on a hybrid supercomputer architecture
Computer Physics Communications (IF 6.3), Pub Date: 2021-04-01, DOI: 10.1016/j.cpc.2020.107786
Alexander Russkov, Roman Chulkevich, Lev N. Shchur

The population annealing method is a promising approach for large-scale simulations because it is potentially scalable on any parallel architecture. We present an implementation of the algorithm on a hybrid program architecture combining CUDA and MPI. The challenge is to keep all general-purpose graphics processing unit devices as busy as possible by redistributing replicas among them efficiently. We provide details of testing on Intel Skylake/Nvidia V100 based hardware, running more than two million replicas of the Ising model sample in parallel. The results are encouraging: the acceleration grows toward the perfect line as the complexity of the simulated system increases.
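
The abstract does not spell out the redistribution algorithm, so the following is only a minimal sketch of the general load-balancing idea, not the authors' method. It assumes that, after a resampling step, each device holds a different number of replicas, and it plans transfers so that the per-device counts differ by at most one. The function name `plan_redistribution` and the greedy pairing of overloaded with underloaded devices are illustrative assumptions.

```python
def plan_redistribution(counts):
    """Plan replica transfers that even out per-device replica counts.

    Hypothetical helper for illustration only; it is not taken from the paper.
    counts[i] is the number of replicas currently held by device i.
    Returns (targets, transfers), where transfers is a list of
    (source_device, destination_device, n_replicas) tuples.
    """
    n_dev = len(counts)
    total = sum(counts)
    base, extra = divmod(total, n_dev)
    # The first `extra` devices keep one replica more, so targets sum to total.
    targets = [base + (1 if i < extra else 0) for i in range(n_dev)]

    # Devices above target can give replicas; devices below target need them.
    surplus = [[i, c - t] for i, (c, t) in enumerate(zip(counts, targets)) if c > t]
    deficit = [[i, t - c] for i, (c, t) in enumerate(zip(counts, targets)) if c < t]

    transfers = []
    s = d = 0
    while s < len(surplus) and d < len(deficit):
        # Move as many replicas as the current pair allows.
        n = min(surplus[s][1], deficit[d][1])
        transfers.append((surplus[s][0], deficit[d][0], n))
        surplus[s][1] -= n
        deficit[d][1] -= n
        if surplus[s][1] == 0:
            s += 1
        if deficit[d][1] == 0:
            d += 1
    return targets, transfers
```

For example, `plan_redistribution([10, 2, 6, 6])` returns `([6, 6, 6, 6], [(0, 1, 4)])`: device 0 sends four replicas to device 1. In a CUDA and MPI setting each tuple would correspond to a device-to-device or node-to-node copy, and a real implementation would presumably also try to minimize communication volume, which this sketch ignores.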

Updated: 2021-04-01