CHAMELEON: Reactive load balancing for hybrid MPI+OpenMP task-parallel applications
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2019-12-16, DOI: 10.1016/j.jpdc.2019.12.005
Jannis Klinkenberg, Philipp Samfass, Michael Bader, Christian Terboven, Matthias S. Müller

Many applications in high performance computing are designed based on underlying performance and execution models. While these models could successfully be employed in the past for balancing load within and between compute nodes, modern software and hardware increasingly make performance predictability difficult if not impossible. Consequently, balancing computational load becomes much more difficult. To tackle these challenges with a general solution, we present a novel library for fine-grained, task-based reactive load balancing in distributed memory based on MPI and OpenMP. With our approach, individual migratable tasks can be executed on any MPI rank. The actual executing rank is determined at run time based on online performance data. We evaluate our approach under an enforced power cap and under enforced clock frequency changes for a synthetic benchmark and show its robustness for work-induced imbalances for a realistic application. Our experiments demonstrate speedups of up to 1.31x.
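
The abstract's central idea, choosing the executing rank at run time from online performance data, can be illustrated with a small sketch. This is not the CHAMELEON API or the authors' algorithm: the function names `plan_migrations` and `apply_moves`, the use of queued-task counts as the load metric, and the pairing heuristic (most loaded rank gives to least loaded rank) are all assumptions made for illustration.

```python
def plan_migrations(loads, min_tasks=1):
    """Plan task migrations from overloaded to underloaded ranks.

    loads[r] is the number of queued tasks that rank r last reported
    (hypothetical stand-in for any online performance metric).
    Returns a list of (src_rank, dst_rank, n_tasks) tuples.
    """
    # Rank ids ordered from most loaded to least loaded.
    order = sorted(range(len(loads)), key=lambda rank: loads[rank], reverse=True)
    moves = []
    # Pair the i-th most loaded rank with the i-th least loaded rank.
    for i in range(len(order) // 2):
        src, dst = order[i], order[-1 - i]
        # Moving half of the gap equalizes the two paired ranks.
        n_tasks = (loads[src] - loads[dst]) // 2
        if n_tasks >= min_tasks:
            moves.append((src, dst, n_tasks))
    return moves


def apply_moves(loads, moves):
    """Return the per-rank loads after carrying out the planned moves."""
    balanced = list(loads)
    for src, dst, n_tasks in moves:
        balanced[src] -= n_tasks
        balanced[dst] += n_tasks
    return balanced


if __name__ == "__main__":
    loads = [12, 4, 8, 0]           # queued tasks on ranks 0..3
    moves = plan_migrations(loads)
    print(moves)                    # planned (src, dst, n_tasks) moves
    print(apply_moves(loads, moves))
```

In a real MPI+OpenMP setting the load values would be exchanged between ranks periodically and the plan recomputed each time, so the balancing reacts to slowdowns (for example from a power cap) instead of relying on a performance model fixed in advance.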




Updated: 2020-01-04