Intra-Tile Parallelization for Two-Level Perfectly Nested Loops With Non-Uniform Dependences
The Computer Journal (IF 1.4), Pub Date: 2020-05-28, DOI: 10.1093/comjnl/bxaa050
Zahra Abdi Reyhan, Shahriar Lotfi, Ayaz Isazadeh, Jaber Karimpour

Most important scientific and engineering applications involve complex computations or large data sets. In these applications, nested loops consume a large share of the execution time, so loops are the main source of parallelism in scientific and engineering programs. Many parallelizing compilers focus on nested loops with uniform dependences, whereas the parallelization of nested loops with non-uniform dependences has not been extensively investigated. This paper addresses the problem of parallelizing two-level nested loops with non-uniform dependences. The aim is to minimize execution time by improving load balancing and minimizing inter-processor communication. We propose a new tiling algorithm, k-StepIntraTiling, which uses the bin packing problem to minimize execution time. Simulation and experimental results show that the algorithm reduces the total execution time of several benchmarks compared with other tiling methods.
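To make the distinction between uniform and non-uniform dependences concrete, the following is a minimal illustrative sketch and is not taken from the paper. It brute-forces the dependence distance vectors of a small two-level loop nest that writes one array cell and reads another in each iteration. The helper `distance_vectors` and the two example access patterns are assumptions chosen for illustration only. A loop has uniform dependences when every pair of dependent iterations is separated by the same distance vector; otherwise the dependences are non-uniform.

```python
from itertools import product


def distance_vectors(write_index, read_index, n):
    """Return the set of distance vectors (reader - writer) between
    iterations of an n x n loop nest that touch the same array cell.

    write_index and read_index map an iteration (i, j) to the array
    cell written or read in that iteration. Hypothetical helper for
    illustration, not the paper's analysis.
    """
    iterations = list(product(range(1, n + 1), repeat=2))

    # Map each written cell to the iterations that write it.
    writers = {}
    for it in iterations:
        writers.setdefault(write_index(*it), []).append(it)

    distances = set()
    for reader in iterations:
        for writer in writers.get(read_index(*reader), []):
            if writer != reader:
                distances.add((reader[0] - writer[0], reader[1] - writer[1]))
    return distances


# Uniform case: A[i][j] = f(A[i-1][j-1]); every dependence has distance (1, 1).
uniform = distance_vectors(lambda i, j: (i, j),
                           lambda i, j: (i - 1, j - 1), 4)

# Non-uniform case: A[2*i][j] = f(A[i][j]); the distance grows with i.
non_uniform = distance_vectors(lambda i, j: (2 * i, j),
                               lambda i, j: (i, j), 4)

print(len(uniform), len(non_uniform))
```

In the second loop the distance between dependent iterations depends on the iteration itself, so no single fixed tile shape or skew captures all dependences. This is the kind of loop the abstract describes as under-served by compilers that assume uniform dependences.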

Updated: 2020-05-28