Computing integrals for electron molecule scattering on heterogeneous accelerator systems
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2020-09-24, DOI: 10.1002/cpe.5984
Charles J. Gillan, Ivor Spence

Using heterogeneous accelerators to obtain high performance for mathematical kernels remains an active research frontier in computational science. Accelerators have compute architectures that differ from those of CPUs and, in addition, have memory spaces independent of the CPU systems to which they are connected. It follows that writing optimal code for accelerators requires a different approach from that needed on a multi‐CPU system. Taken together, these issues have represented a significant barrier to the widespread adoption of accelerators for large legacy code bases. OpenCL has emerged as a common programming language with which to implement code that runs across a range of parallel architectures, including multi‐core CPUs. This article is a case study on how the instruction‐level parallelism offered by field programmable gate arrays (FPGAs) and graphics processing units (GPUs) through OpenCL can be exploited in molecular physics. The algorithm which we study is the evaluation of tail integrals between Gaussian type basis functions for the R‐matrix method, a task that arises in the study of scattering of low energy electrons by molecular targets. The results of our productivity study, which is the first application of OpenCL in this problem domain, show that significant performance can be obtained from both FPGA and GPU accelerators for this application. We discuss suitable transformations unique to each accelerator architecture for the integrals studied and present performance results comparing the FPGA and GPU with execution on Intel multi‐core systems.

Updated: 2020-09-24