Rendezvous algorithms for large-scale modeling and simulation, Journal of Parallel and Distributed Computing



Rendezvous algorithms for large-scale modeling and simulation
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2020-09-18, DOI: 10.1016/j.jpdc.2020.09.001
Steven J. Plimpton, Christopher Knight

Rendezvous algorithms encode a communication pattern that is useful when processors sending data do not know who the receiving processors should be, or vice versa. The idea is to define an intermediate decomposition where datums from different sending processors can "rendezvous" to perform a computation, in such a way that both the senders and the eventual receivers of the results can identify the appropriate rendezvous processor.
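The pattern described above can be sketched as a small single-process simulation. This is only an illustrative sketch, not the authors' implementation: the names `rendezvous_rank`, `owned`, and `wanted` are hypothetical, the "processors" are plain Python lists rather than MPI ranks, and it assumes every requested key is owned by exactly one processor.

```python
NUM_PROCS = 4  # hypothetical processor count for this sketch


def rendezvous_rank(key, num_procs=NUM_PROCS):
    """Map a key to its rendezvous processor.

    Any deterministic rule works, as long as senders and receivers
    can both evaluate it without talking to each other.
    """
    return key % num_procs


def rendezvous(owned, wanted, num_procs=NUM_PROCS):
    """Deliver values to requesters that do not know who owns them.

    owned[p]  : dict mapping key -> value held by processor p
    wanted[p] : list of keys that processor p needs
    Returns received[p], a dict mapping key -> value for processor p.
    """
    # Phase 1a: each owner sends (key, value) to the key's rendezvous processor.
    data_inbox = [dict() for _ in range(num_procs)]
    for p in range(num_procs):
        for key, value in owned[p].items():
            data_inbox[rendezvous_rank(key, num_procs)][key] = value

    # Phase 1b: each requester sends (key, requester) to the same processor.
    request_inbox = [[] for _ in range(num_procs)]
    for p in range(num_procs):
        for key in wanted[p]:
            request_inbox[rendezvous_rank(key, num_procs)].append((key, p))

    # Phase 2: the rendezvous processor matches data with requests
    # and sends each result back to the processor that asked for it.
    received = [dict() for _ in range(num_procs)]
    for r in range(num_procs):
        for key, requester in request_inbox[r]:
            received[requester][key] = data_inbox[r][key]
    return received


# Owners and requesters are unrelated; neither side knows the other.
owned = [{10: "a", 7: "b"}, {3: "c"}, {}, {8: "d"}]
wanted = [[3], [8, 10], [7], []]
received = rendezvous(owned, wanted)
```

In a real parallel code each phase would presumably be an all-to-all exchange between ranks; the loops over `p` and `r` here only stand in for that communication.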

Rendezvous algorithms were originally designed for interpolating between overlaid grids with independent parallel decompositions (Plimpton et al., 2004). We have recently found them useful for a variety of operations in particle- or grid-based simulation codes when running large problems on large numbers of processors. In particular, we show they can perform well when a load-balanced intermediate decomposition is randomized and not spatial, requiring all-to-all communication to move data between processors. In this case rendezvous algorithms leverage the large bisection communication bandwidths that parallel machines provide.
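To illustrate why a randomized intermediate decomposition can be load-balanced where a structured one is not, the following sketch compares two ways of assigning keys to rendezvous processors. It is an assumption-laden toy: `block_rank` is a hypothetical stand-in for a spatial decomposition (contiguous key ranges per processor), `hashed_rank` is one possible randomized rule, and the clustered key set is invented for the example.

```python
import hashlib
from collections import Counter


def block_rank(key, max_key, num_procs):
    """Structured assignment: contiguous key ranges per processor."""
    return min(key * num_procs // max_key, num_procs - 1)


def hashed_rank(key, num_procs):
    """Randomized assignment: hash the key, then take it modulo num_procs."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_procs


def load_per_rank(ranks, num_procs):
    """Count how many keys land on each processor."""
    tally = Counter(ranks)
    return [tally.get(r, 0) for r in range(num_procs)]


NUM_PROCS = 8
MAX_KEY = 1_000_000
clustered_keys = range(1000)  # all keys fall in a small part of the key space

block_load = load_per_rank(
    (block_rank(k, MAX_KEY, NUM_PROCS) for k in clustered_keys), NUM_PROCS
)
hashed_load = load_per_rank(
    (hashed_rank(k, NUM_PROCS) for k in clustered_keys), NUM_PROCS
)

print("block :", block_load)   # every key lands on processor 0
print("hashed:", hashed_load)  # keys are spread across all processors
```

The price of the hashed rule is that neighboring keys end up on unrelated processors, which is why the abstract notes that all-to-all communication is needed to move the data.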

We describe how rendezvous algorithms work in a scientific computing context and give specific examples for molecular dynamics and Direct Simulation Monte Carlo codes, which result in dramatic performance improvements over simpler algorithms that do not scale as well. We explain how a generic rendezvous algorithm can be implemented, and also point out similarities with the MapReduce paradigm popularized by Google and Hadoop.


Updated: 2020-09-30
