An in‐depth introduction of multi‐workgroup tiling for improving the locality of explicit one‐step methods for ODE systems with limited access distance on GPUs
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2020-09-22, DOI: 10.1002/cpe.6016
Matthias Korch, Tim Werner

This article considers a locality optimization technique for the parallel solution of a special class of large systems of ordinary differential equations (ODEs) by explicit one‐step methods on GPUs. The technique is based on tiling across the stages of the one‐step method and is enabled by the special structure of the class of ODE systems considered, that is, the limited access distance. The focus of this article is on increasing the range of access distances for which the tiling technique can provide a speedup by joining the memory resources and the computational power of multiple workgroups for the computation of one tile (multi‐workgroup tiling). In particular, this article provides an extended in‐depth introduction and discussion of the multi‐workgroup tiling technique and its theoretical and technical foundations, together with a new tuning option (mapping stride) and new experiments. The experiments show speedups of the multi‐workgroup tiling technique compared with traditional single‐workgroup tiling for two different Runge–Kutta methods on NVIDIA's Kepler and Volta architectures.
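
To illustrate why a limited access distance permits tiling across stages, the following is a minimal CPU sketch, not the authors' GPU implementation and not multi‐workgroup tiling itself. It assumes a hypothetical 1D diffusion‐like right‐hand side with access distance D = 1, uses Heun's two‐stage method as the explicit one‐step method, and uses overlapped tiles that recompute a halo of width D for the first stage; the tile shape used in the article may differ. All function names are made up for this example.

```python
D = 1  # access distance of the example stencil (assumption for this sketch)


def f(get, j, n):
    """Component j of the right-hand side; reads only indices j-D..j+D.

    Values outside the domain [0, n) are treated as zero.
    """
    left = get(j - 1) if j - 1 >= 0 else 0.0
    right = get(j + 1) if j + 1 < n else 0.0
    return left - 2.0 * get(j) + right


def heun_step_global(y, h):
    """Reference: each stage sweeps the whole system before the next starts."""
    n = len(y)
    k1 = [f(y.__getitem__, j, n) for j in range(n)]
    y1 = [y[j] + h * k1[j] for j in range(n)]
    k2 = [f(y1.__getitem__, j, n) for j in range(n)]
    return [y[j] + 0.5 * h * (k1[j] + k2[j]) for j in range(n)]


def heun_step_tiled(y, h, tile):
    """Tiled across stages: both stages are finished for one tile at a time.

    Stage 1 is evaluated on the tile plus a halo of width D, so that stage 2
    on the tile finds all of its inputs in the tile-local buffers.
    """
    n = len(y)
    out = [0.0] * n
    for a in range(0, n, tile):
        b = min(a + tile, n)
        lo, hi = max(a - D, 0), min(b + D, n)  # tile plus halo, clipped
        k1 = {j: f(y.__getitem__, j, n) for j in range(lo, hi)}
        y1 = {j: y[j] + h * k1[j] for j in range(lo, hi)}
        for j in range(a, b):
            k2 = f(y1.__getitem__, j, n)
            out[j] = y[j] + 0.5 * h * (k1[j] + k2)
    return out


if __name__ == "__main__":
    y0 = [float((3 * j) % 7) for j in range(20)]
    ref = heun_step_global(y0, 0.1)
    tiled = heun_step_tiled(y0, 0.1, tile=4)
    print(max(abs(p - q) for p, q in zip(ref, tiled)))
```

In this sketch the halo grows by D per additional stage, so the tile‐local working set grows with both the access distance and the number of stages. That growth is one way to see why a single workgroup's memory can become the limiting factor for larger access distances, which is the situation the abstract's multi‐workgroup tiling targets.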

Updated: 2020-09-22