Low learning-cost offline strategies for EDP optimization of parallel applications, Journal of Systems Architecture



Low learning-cost offline strategies for EDP optimization of parallel applications
Journal of Systems Architecture (IF 3.7), Pub Date: 2020-12-02, DOI: 10.1016/j.sysarc.2020.101959
Gustavo Paim Berned, Fábio D. Rossi, Marcelo C. Luizelli, Samuel Xavier de Souza, Antonio Carlos S. Beck, Arthur F. Lorenzon

Many parallel applications do not scale with the number of threads. Several online and offline strategies have been proposed to optimize this number. While online strategies can capture behaviors that are only known at runtime, offline strategies impose no execution overhead and can use more complex and efficient algorithms. However, the learning algorithm in these offline strategies may take several hours, which precludes their use or hinders smooth portability across different systems. To address this, we propose a methodology that decreases the learning time of offline strategies by inferring the execution behavior of parallel applications from smaller input sets than the ones used by the target applications. It implements two search strategies: SEA, where all parallel regions of an application run with the same number of threads; and SPRA, which seeks an ideal number of threads for each parallel region of a given application. With an extensive set of experiments, we show that the SEA and SPRA strategies converge to results close to those of an offline approach applied over the regular input, while being on average 88% and 87% faster, respectively. We also show that SPRA is better than SEA for unbalanced applications.
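
To make the difference between the two search strategies concrete, the following is a minimal sketch and not the authors' implementation. It assumes EDP is computed as energy multiplied by execution time, that per-region time and energy have already been measured (for instance on a small input set), and that an exhaustive search over a few thread counts is acceptable. The region names, the numbers in `MEASUREMENTS`, and the function names are all hypothetical, and the per-region choice in `spra` is a greedy simplification that minimizes each region's own EDP.

```python
from typing import Dict, List, Tuple

# Hypothetical measurements: region -> {threads: (time in s, energy in J)}.
# The values are invented for illustration only.
Measurements = Dict[str, Dict[int, Tuple[float, float]]]

MEASUREMENTS: Measurements = {
    "balanced":   {1: (8.0, 80.0), 2: (4.0, 60.0), 4: (2.0, 50.0)},
    "unbalanced": {1: (4.0, 40.0), 2: (3.0, 45.0), 4: (3.5, 70.0)},
}


def edp(time_s: float, energy_j: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy_j * time_s


def app_edp(meas: Measurements, config: Dict[str, int]) -> float:
    """EDP of the whole application for a region -> thread-count mapping."""
    total_time = sum(meas[r][config[r]][0] for r in meas)
    total_energy = sum(meas[r][config[r]][1] for r in meas)
    return edp(total_time, total_energy)


def sea(meas: Measurements, thread_counts: List[int]) -> Dict[str, int]:
    """SEA-like search: one thread count shared by every parallel region."""
    best = min(thread_counts,
               key=lambda t: app_edp(meas, {r: t for r in meas}))
    return {r: best for r in meas}


def spra(meas: Measurements, thread_counts: List[int]) -> Dict[str, int]:
    """SPRA-like search (greedy assumption): best thread count per region."""
    return {r: min(thread_counts, key=lambda t: edp(*meas[r][t]))
            for r in meas}


if __name__ == "__main__":
    counts = [1, 2, 4]
    for name, search in (("SEA", sea), ("SPRA", spra)):
        config = search(MEASUREMENTS, counts)
        print(name, config, app_edp(MEASUREMENTS, config))
```

With this invented data, the per-region search assigns fewer threads to the region that stops scaling, which lowers the application-level EDP compared with a single shared thread count. This only illustrates the direction of the effect described in the abstract.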


Updated: 2020-12-02
