Bi-Objective Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms for Performance and Energy Through Workload Distribution
IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2020-09-29, DOI: 10.1109/tpds.2020.3027338
Hamidreza Khaleghzadeh, Muhammad Fahad, Arsalan Shahid, Ravi Reddy Manumachu, Alexey Lastovetsky

Performance and energy are the two most important objectives for optimization on modern parallel platforms. In this article, we show that moving from single-objective optimization for performance or energy to their bi-objective optimization on heterogeneous processors results in a tremendous increase in the number of optimal solutions (workload distributions), even for the simple case of linear performance and energy profiles. We then study the full performance and energy profiles of two real-life data-parallel applications and find that their shapes are non-linear and complex enough to prevent good approximation by analytical functions that could serve as input to exact algorithms or optimization software for determining the Pareto front. We therefore propose a method for solving the bi-objective optimization problem on heterogeneous processors. The method's novel component is an efficient and exact global optimization algorithm that takes as input performance and energy profiles given as arbitrary discrete functions of workload size, which accurately and realistically take into account the resource contention and NUMA inherent in modern parallel platforms, and returns the Pareto-optimal solutions (generally speaking, load imbalanced). To construct the input discrete energy functions, the method employs a methodology that accurately models the energy consumption of a hybrid data-parallel application executing on a heterogeneous HPC platform containing different computing devices, using system-level power measurements provided by power meters. We experimentally analyse the proposed solution method using three data-parallel applications (matrix multiplication, 2D fast Fourier transform (2D-FFT), and gene sequencing) on two connected heterogeneous servers consisting of multicore CPUs, GPUs, and Intel Xeon Phi. We show that it determines a superior Pareto front containing the best load-balanced solutions and all the load-imbalanced solutions that are ignored by load-balancing methods.
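
To make the problem statement concrete, the following is a minimal brute-force sketch in Python, not the efficient exact algorithm the article proposes. It assumes a model the abstract does not spell out: the execution time of a distribution is the maximum of the per-processor times, and its energy is the sum of the per-processor energies. The function names (`distributions`, `pareto_front`, `bi_objective_front`) and the toy linear profiles are hypothetical and chosen only for illustration.

```python
def distributions(n, p):
    """Yield every way to split n workload units among p processors."""
    if p == 1:
        yield (n,)
        return
    for x in range(n + 1):
        for rest in distributions(n - x, p - 1):
            yield (x,) + rest


def pareto_front(points):
    """Keep the non-dominated (time, energy, distribution) tuples.

    Both objectives are minimised. After sorting by time, a point is kept
    only if its energy is strictly lower than every energy seen so far.
    """
    front = []
    best_energy = float("inf")
    for t, e, dist in sorted(points):
        if e < best_energy:
            front.append((t, e, dist))
            best_energy = e
    return front


def bi_objective_front(n, time_profiles, energy_profiles):
    """Enumerate all distributions of n units and return the Pareto front.

    time_profiles[i][x] and energy_profiles[i][x] are discrete profiles:
    the time and energy of processor i when it is given x workload units.
    Assumed model: time = max over processors, energy = sum over processors.
    """
    p = len(time_profiles)
    candidates = []
    for dist in distributions(n, p):
        t = max(time_profiles[i][x] for i, x in enumerate(dist))
        e = sum(energy_profiles[i][x] for i, x in enumerate(dist))
        candidates.append((t, e, dist))
    return pareto_front(candidates)


# Toy linear profiles for two processors and a workload of 4 units.
# Processor 0 is fast but power-hungry; processor 1 is slow but frugal.
time_profiles = [[0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]
energy_profiles = [[0, 4, 8, 12, 16], [0, 1, 2, 3, 4]]

for t, e, dist in bi_objective_front(4, time_profiles, energy_profiles):
    print(f"distribution={dist} time={t} energy={e}")
```

In this toy instance, optimizing for time alone selects the single distribution (3, 1) and optimizing for energy alone selects (0, 4), whereas the bi-objective front contains four distributions. The enumeration grows combinatorially with the workload size and the number of processors, so it is useful only as a reference for very small inputs.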

Updated: 2020-09-29