Static detection of uncoalesced accesses in GPU programs
Formal Methods in System Design (IF 0.8), Pub Date: 2021-03-05, DOI: 10.1007/s10703-021-00362-8
Rajeev Alur, Joseph Devietti, Omar S. Navarro Leija, Nimit Singhania

GPU programming has become popular because of the high computational capability of GPUs. Obtaining significant performance gains on a GPU is challenging, however, and the programmer needs to be aware of various subtleties of the GPU architecture. One such subtlety lies in accessing GPU memory, where certain access patterns, referred to as uncoalesced global memory accesses, can lead to poor performance. This work presents a lightweight compile-time static analysis to identify such accesses in GPU programs. The analysis relies on a novel abstraction that tracks the access pattern across multiple threads. The abstraction enables quick prediction while providing correctness guarantees. We have implemented the analysis in LLVM and compare it against a dynamic analysis implementation. The static analysis identifies 95 pre-existing uncoalesced accesses in Rodinia, a popular benchmark suite of GPU programs, and finishes within seconds for most programs; the dynamic analysis, by comparison, finds 69 accesses and takes orders of magnitude longer to finish.
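To make the notion of an uncoalesced access concrete, the following is a minimal host-side sketch in C++. It is not the paper's static analysis and does not use LLVM. It only models, under stated assumptions, how many memory transactions one warp would need for a given index expression. The warp size of 32, the 128-byte transaction segment, the 4-byte element size, and the helper names `transactions_per_warp`, `coalesced_cost`, and `strided_cost` are all assumptions introduced for illustration; real hardware differs in its details.

```cpp
#include <cstddef>
#include <functional>
#include <set>

// Assumed hardware parameters, for illustration only; real GPUs vary.
constexpr int kWarpSize = 32;               // threads that issue a load together
constexpr std::size_t kSegmentBytes = 128;  // bytes served by one transaction

// Counts the distinct memory segments touched when every thread `tid` of one
// warp reads the float element at index `index_of(tid)`.
int transactions_per_warp(const std::function<std::size_t(int)>& index_of) {
    std::set<std::size_t> segments;
    for (int tid = 0; tid < kWarpSize; ++tid) {
        std::size_t byte_offset = index_of(tid) * sizeof(float);
        segments.insert(byte_offset / kSegmentBytes);
    }
    return static_cast<int>(segments.size());
}

// Models an access of the form a[tid]: adjacent threads read adjacent elements.
int coalesced_cost() {
    return transactions_per_warp(
        [](int tid) { return static_cast<std::size_t>(tid); });
}

// Models an access of the form a[tid * width]: adjacent threads read elements
// that lie `width` elements apart, as in a column walk over a row-major matrix.
int strided_cost(std::size_t width) {
    return transactions_per_warp(
        [width](int tid) { return static_cast<std::size_t>(tid) * width; });
}
```

Under these assumptions, `coalesced_cost()` reports a single segment for the whole warp, whereas `strided_cost(32)` touches a separate segment for each of the 32 threads. Access patterns of the second kind are what the abstract calls uncoalesced. A static analysis has to reach the same conclusion from the index expression alone, without executing the program.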




Updated: 2021-03-05