CFD code adaptation to the FPGA architecture
The International Journal of High Performance Computing Applications (IF 3.1), Pub Date: 2020-11-10, DOI: 10.1177/1094342020972461
Krzysztof Rojek, Kamil Halbiniak, Lukasz Kuczynski

In recent years, we have observed intensive development of accelerated computing platforms. Although current trends indicate a well-established position of GPU devices in the HPC environment, the FPGA (Field-Programmable Gate Array) aspires to be an alternative for offloading CPU computation. This paper presents a systematic adaptation of four different CFD (Computational Fluid Dynamics) kernels to the Xilinx Alveo U250 FPGA. The goal is to investigate the potential of the FPGA architecture as a future infrastructure capable of running the most complex numerical simulations in fluid flow modeling. The selected kernels are tailored to a real scientific scenario and are compatible with the EULAG (Eulerian/semi-Lagrangian) fluid solver, which is used to simulate thermo-fluid flows across a wide range of scales and is extensively used in numerical weather prediction. The proposed adaptation focuses on analyzing the strengths and weaknesses of the FPGA accelerator in terms of performance and energy efficiency. It is compared with a heavily optimized CPU implementation to provide realistic and objective benchmarks. The performance results are compared against a set of server CPUs from several Intel generations, including the Skylake-based Xeon Gold 6148 and Xeon Platinum 8168, as well as the Ivy Bridge-based Xeon E5-2695. Since all the kernels are memory-bound, our main challenge is to saturate the global memory bandwidth and provide data locality through intensive reuse of BRAM (Block RAM). Our adaptation improves performance per watt by up to 80% compared to the CPUs.
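To make the point about memory-bound kernels and on-chip data reuse more concrete, the following is a minimal sketch in plain C++. It is not one of the four kernels from the paper and does not use any Xilinx-specific API; the function name `stencil3_windowed` and the 3-point averaging stencil are assumptions chosen purely for illustration. The idea it shows is that each input element is fetched from the large array (standing in for global memory) only once and is then reused from a small local window (standing in for on-chip storage such as BRAM or registers).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 3-point averaging stencil (illustration only, not from the paper).
// `in` stands in for global memory; `window` stands in for a small on-chip buffer.
// Every element of `in` is read exactly once, then reused from `window`.
std::vector<float> stencil3_windowed(const std::vector<float>& in) {
    const std::size_t n = in.size();
    std::vector<float> out(n, 0.0f);   // boundary points are left at 0
    if (n < 3) return out;             // no interior points to compute

    // Preload so that the first shift inside the loop yields {in[0], in[1], in[2]}.
    float window[3] = {0.0f, in[0], in[1]};

    for (std::size_t i = 1; i + 1 < n; ++i) {
        // Slide the window by one element: two local moves, one new "global" read.
        window[0] = window[1];
        window[1] = window[2];
        window[2] = in[i + 1];

        out[i] = (window[0] + window[1] + window[2]) / 3.0f;
    }
    return out;
}
```

A naive version would read `in[i - 1]`, `in[i]`, and `in[i + 1]` for every output point, that is, three global reads per result. The windowed form needs one, which is the kind of saving that matters when bandwidth rather than arithmetic is the limiting factor. Calling `stencil3_windowed` on `{3, 6, 9, 12, 15}` yields interior values 6, 9, and 12.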

Updated: 2020-11-10