Porting a sparse linear algebra math library to Intel GPUs
arXiv - CS - Mathematical Software. Pub Date: 2021-03-18, DOI: arxiv-2103.10116. Yuhsiang M. Tsai, Terry Cojean, Hartwig Anzt
With the announcement that the Aurora Supercomputer will be composed of
general purpose Intel CPUs complemented by discrete high performance Intel
GPUs, and the deployment of the oneAPI ecosystem, Intel has committed to entering
the arena of discrete high performance GPUs. A central requirement for the
scientific computing community is the availability of production-ready software
stacks and a glimpse of the performance they can expect to see on Intel high
performance GPUs. In this paper, we present the first platform-portable open
source math library supporting Intel GPUs via the DPC++ programming
environment. We also benchmark some of the developed sparse linear algebra
functionality on different Intel GPUs to assess the efficiency of the DPC++
programming ecosystem in translating raw performance into application
performance. Aside from quantifying the efficiency within the hardware-specific
roofline model, we also compare against routines providing the same
functionality that ship with Intel's oneMKL vendor library.
Updated: 2021-03-19