Accelerated execution via eager-release of dependencies in task-based workflows
The International Journal of High Performance Computing Applications (IF 3.5), Pub Date: 2021-03-03, DOI: 10.1177/1094342021997558
Hatem Elshazly, Francesc Lordan, Jorge Ejarque, Rosa M. Badia

Task-based programming models offer a flexible way to express the unstructured parallelism patterns of today's complex applications. This expressiveness is required to achieve the maximum possible performance for applications executed on distributed platforms. In current task-based workflows, tasks are launched for execution when their data dependencies are satisfied. However, even if the data a task depends on has already been produced, the task's execution is delayed until its predecessor tasks have completely finished. As a consequence of this approach to releasing dependencies, the parallelism inherent in applications is limited and opportunities to improve performance are wasted. To mitigate this limitation, we propose an eager approach to releasing data dependencies. Under this approach, a task does not wait for its predecessors to finish completely; instead, it is launched as soon as its data requirements are available. Hence, more parallelism is exposed, and applications can achieve higher performance by overlapping the execution of tasks. To this end, we propose two changes to task-based workflow systems. First, dependency relationships are specified not only in terms of predecessor and successor tasks but also in terms of the data that caused the dependencies. Second, a dependency is released as soon as the predecessor task generates the corresponding output data, rather than all dependencies being released at the end of the predecessor's execution. We realize this proposal using PyCOMPSs, a task-based programming model for parallelizing Python applications. Our experiments show that the eager approach to releasing dependencies improves total execution time by more than 50% compared with the default approach.
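The difference between the two release policies can be illustrated with a small scheduling model. The sketch below is not the PyCOMPSs API or the authors' implementation: the `Task` class, the `makespan` function, the per-datum production offsets, and all durations are hypothetical, and it assumes unlimited workers. It records each dependency as a (predecessor, datum) pair, in line with the first proposed change, and releases it either when the predecessor finishes (default) or when the datum is produced (eager).

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration: float
    # Hypothetical: offset from task start at which each output datum is ready.
    outputs: dict[str, float] = field(default_factory=dict)
    # Dependencies recorded as (predecessor name, datum name) pairs.
    inputs: list[tuple[str, str]] = field(default_factory=list)


def makespan(tasks: list[Task], eager: bool) -> float:
    """Finish time of the last task, assuming unlimited workers.

    `tasks` must be listed in topological order.
    """
    by_name = {t.name: t for t in tasks}
    start: dict[str, float] = {}
    for task in tasks:
        ready = 0.0
        for pred_name, datum in task.inputs:
            pred = by_name[pred_name]
            if eager:
                # Released as soon as the predecessor produces this datum.
                released = start[pred_name] + pred.outputs[datum]
            else:
                # Released only when the predecessor finishes completely.
                released = start[pred_name] + pred.duration
            ready = max(ready, released)
        start[task.name] = ready
    return max(start[t.name] + t.duration for t in tasks)


# Illustrative numbers only; they are not taken from the paper's experiments.
workflow = [
    Task("producer", duration=10.0, outputs={"a": 2.0, "b": 10.0}),
    Task("consume_a", duration=6.0, inputs=[("producer", "a")]),
    Task("consume_b", duration=3.0, inputs=[("producer", "b")]),
]

default_time = makespan(workflow, eager=False)
eager_time = makespan(workflow, eager=True)
print(default_time, eager_time)
```

In this toy workflow, `consume_a` overlaps with `producer` under the eager policy because datum `a` is available early, while `consume_b` still has to wait for datum `b`, which is only produced at the end of `producer`.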




Updated: 2021-03-25