Towards High-Performance Code Generation for Multi-GPU Clusters Based on a Domain-Specific Language for Algorithmic Skeletons
International Journal of Parallel Programming (IF 1.5), Pub Date: 2020-05-22, DOI: 10.1007/s10766-020-00659-x
Fabian Wrede, Herbert Kuchen

In earlier work, we defined a domain-specific language (DSL) with the aim of providing an easy-to-use approach to programming multi-core and multi-GPU clusters. The DSL builds on algorithmic skeletons, well-known patterns for parallel programming such as map and reduce. Based on the chosen skeleton, a user-defined function is applied to a data structure in parallel; the main advantage is that the user does not have to worry about implementation details. Until now, we had only implemented a code generator for multi-core clusters; in this paper, we present and evaluate two prototype generators for multi-GPU clusters, based on OpenACC and CUDA, respectively. We have evaluated the approach with four benchmark applications. The results show that the generated code achieves execution times on par with an alternative library implementation.
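To illustrate the general idea (this is not the authors' DSL syntax or their generated code), the following minimal CUDA sketch shows how a map skeleton can be lowered to a GPU backend: the user-defined function is expressed as a device functor, and a generic kernel applies it to every element of an array in parallel. The names Square and mapKernel are illustrative assumptions.

// Minimal sketch of a map skeleton on one GPU (not the paper's generated code).
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical user-defined function: square each element.
struct Square {
    __device__ float operator()(float x) const { return x * x; }
};

// Generic map kernel: one thread per element, functor applied in parallel.
template <typename F>
__global__ void mapKernel(const float* in, float* out, int n, F f) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = f(in[i]);
}

int main() {
    const int n = 1 << 20;
    std::vector<float> host(n, 3.0f);

    float *dIn, *dOut;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dIn, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const int block = 256;
    const int grid = (n + block - 1) / block;
    mapKernel<<<grid, block>>>(dIn, dOut, n, Square{});
    cudaMemcpy(host.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("host[0] = %f\n", host[0]);  // expected: 9.000000
    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}

In the multi-GPU cluster setting targeted by the paper, the data structure would additionally be partitioned across nodes and devices before such a kernel runs; the sketch deliberately omits that distribution logic.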
