Bi-objective Optimization of Data-parallel Applications on Heterogeneous HPC Platforms for Performance and Energy through Workload Distribution
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2021-03-01, DOI: 10.1109/tpds.2020.3027338
Hamidreza Khaleghzadeh, Muhammad Fahad, Arsalan Shahid, Ravi Reddy Manumachu, Alexey Lastovetsky

Performance and energy are the two most important objectives for optimization on modern parallel platforms. In this article, we show that moving from single-objective optimization for performance or energy to their bi-objective optimization on heterogeneous processors results in a tremendous increase in the number of optimal solutions (workload distributions), even for the simple case of linear performance and energy profiles. We then study the full performance and energy profiles of two real-life data-parallel applications and find that their shapes are non-linear and complex enough that they cannot be approximated well by analytical functions suitable as input to exact algorithms or optimization software for determining the Pareto front. We therefore propose a solution method for the bi-objective optimization problem on heterogeneous processors. The method's novel component is an efficient and exact global optimization algorithm that takes as input performance and energy profiles given as arbitrary discrete functions of workload size, which accurately and realistically take into account the resource contention and NUMA inherent in modern parallel platforms, and returns the Pareto-optimal solutions (which are, generally speaking, load imbalanced). To construct the input discrete energy functions, the method employs a methodology that accurately models the energy consumed by a hybrid data-parallel application executing on a heterogeneous HPC platform containing different computing devices, using system-level power measurements provided by power meters. We experimentally analyse the proposed solution method using three data-parallel applications, matrix multiplication, 2D fast Fourier transform (2D-FFT), and gene sequencing, on two connected heterogeneous servers consisting of multicore CPUs, GPUs, and Intel Xeon Phi. We show that it determines a superior Pareto front containing the best load balanced solutions and all the load imbalanced solutions that are ignored by load balancing methods.
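
To make the problem statement concrete, the following is a minimal brute-force sketch, not the efficient algorithm the abstract refers to. It assumes that the execution time of a workload distribution is the maximum of the per-processor times and that its energy is the sum of the per-processor energies. The function names (`distributions`, `pareto_front`, `bi_objective`) and the sample linear profiles are hypothetical and chosen only for illustration.

```python
def distributions(n, p):
    """Yield every way to split n workload units among p processors."""
    if p == 1:
        yield (n,)
        return
    for x in range(n + 1):
        for rest in distributions(n - x, p - 1):
            yield (x,) + rest


def pareto_front(points):
    """Keep the non-dominated (time, energy, distribution) triples.

    Both objectives are minimized. After sorting by time, a point is kept
    only if it strictly improves on the best energy seen so far, so ties
    in (time, energy) keep a single representative distribution.
    """
    front = []
    best_energy = float("inf")
    for t, e, dist in sorted(points):
        if e < best_energy:
            front.append((t, e, dist))
            best_energy = e
    return front


def bi_objective(n, time_profiles, energy_profiles):
    """Exhaustively evaluate all distributions of n units (illustration only).

    time_profiles[i][x] and energy_profiles[i][x] are discrete functions of
    the workload size x assigned to processor i.
    Assumed model: time = max over processors, energy = sum over processors.
    """
    p = len(time_profiles)
    points = []
    for dist in distributions(n, p):
        t = max(time_profiles[i][x] for i, x in enumerate(dist))
        e = sum(energy_profiles[i][x] for i, x in enumerate(dist))
        points.append((t, e, dist))
    return pareto_front(points)


# Hypothetical linear profiles for two processors and n = 4 workload units:
# processor 0 is faster but consumes more energy per unit than processor 1.
time_profiles = [[1 * x for x in range(5)], [2 * x for x in range(5)]]
energy_profiles = [[3 * x for x in range(5)], [1 * x for x in range(5)]]

front = bi_objective(4, time_profiles, energy_profiles)
for t, e, dist in front:
    print(f"time={t} energy={e} distribution={dist}")
```

With these made-up linear profiles, minimizing time alone or energy alone gives one distribution each, while the bi-objective front contains four of the five possible distributions. Exhaustive enumeration grows combinatorially with the workload size and the number of processors, so this sketch is useful only for tiny instances.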

Updated: 2021-03-01