Resource-Aware Data Parallel Array Processing
International Journal of Parallel Programming (IF 0.9), Pub Date: 2020-06-09, DOI: 10.1007/s10766-020-00664-0
Clemens Grelck, Cédric Blom

Malleable applications can run with varying numbers of threads, and thus on varying numbers of cores, while the precise number of threads is irrelevant to the program logic. Malleability is a common property in data-parallel array processing. With ever-growing core counts, we are increasingly faced with the problem of choosing the best number of threads. We propose a compiler-directed, almost automatic tuning approach for the functional array processing language SaC. Our approach consists of an offline training phase during which compiler-instrumented application code systematically explores the design space and accumulates a persistent database of profiling data. When generating production code, our compiler consults this database and augments each data-parallel operation with a recommendation table. Based on these recommendation tables, the runtime system chooses the number of threads individually for each data-parallel operation. With energy/power efficiency becoming an ever greater concern, we explicitly distinguish between two application scenarios: aiming at the best possible performance, or aiming at a beneficial trade-off between performance and resource investment.
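
To make the idea of a recommendation table more concrete, the following Python sketch shows one way profiling data could be turned into per-operation thread recommendations for the two scenarios. It is illustrative only and is not the SaC compiler's or runtime system's actual mechanism: the function names, the indexing of the table by problem size, and the `tolerance` rule used for the trade-off scenario are all assumptions introduced here.

```python
import bisect
from typing import Dict

# Hypothetical profiling record for one data-parallel operation at one
# problem size: thread count -> measured runtime in seconds.
Profile = Dict[int, float]


def recommend_performance(profile: Profile) -> int:
    """Scenario 1: pick the thread count with the lowest measured runtime.

    Ties are broken in favour of fewer threads.
    """
    return min(profile, key=lambda threads: (profile[threads], threads))


def recommend_tradeoff(profile: Profile, tolerance: float = 0.10) -> int:
    """Scenario 2: pick the fewest threads whose runtime is close to the best.

    ASSUMPTION: "close" means within `tolerance` (relative) of the best
    runtime. This criterion is a placeholder, not the one from the paper.
    """
    best = min(profile.values())
    limit = best * (1.0 + tolerance)
    return min(t for t, runtime in profile.items() if runtime <= limit)


def build_table(profiles: Dict[int, Profile], tradeoff: bool) -> Dict[int, int]:
    """Build a table: profiled problem size -> recommended thread count."""
    pick = recommend_tradeoff if tradeoff else recommend_performance
    return {size: pick(profile) for size, profile in profiles.items()}


def choose_threads(table: Dict[int, int], size: int) -> int:
    """Runtime lookup: use the largest profiled size not exceeding `size`.

    Sizes below the smallest profiled size fall back to the smallest entry.
    """
    sizes = sorted(table)
    index = bisect.bisect_right(sizes, size) - 1
    return table[sizes[max(index, 0)]]


# Made-up measurements for a single operation at two problem sizes.
profiles = {
    1_000: {1: 0.010, 2: 0.009, 4: 0.012},
    1_000_000: {1: 8.0, 2: 4.2, 4: 2.4, 8: 2.0, 16: 1.9},
}
performance_table = build_table(profiles, tradeoff=False)
tradeoff_table = build_table(profiles, tradeoff=True)
```

With these made-up numbers, the performance table recommends 16 threads for the large problem, while the trade-off table settles for 8 because the step from 8 to 16 threads buys only a small improvement. Keeping one table per operation lets each data-parallel operation receive its own thread count at runtime, as the abstract describes.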

Updated: 2020-06-09