GRAM
ACM Transactions on Architecture and Code Optimization (IF 1.6), Pub Date: 2021-02-10, DOI: 10.1145/3441830
Nhut-Minh Ho, Himeshi De Silva, Weng-Fai Wong

This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained tradeoff between output error and performance. It can create many variants that satisfy different accuracy requirements by adaptively assigning different groups of threads to different precision levels at runtime. To widen the range of applications that can benefit from this approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, we can trade precision for performance improvements of up to 540%, depending on the application and accuracy requirement.
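The idea of assigning different groups of threads to different precision levels can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions, not GRAM's implementation or API: the function names (`to_half`, `to_single`, `axpy_mixed`, `max_abs_error`), the `group_size` and `half_groups` parameters, and the choice of an `a*x + y` workload are all hypothetical. Half and single precision are emulated by rounding each intermediate result through Python's `struct` formats `e` and `f`; a real CUDA version would use native half-precision types instead.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (emulation)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value (emulation)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def axpy_mixed(a, xs, ys, group_size, half_groups):
    """Compute a*x + y element by element.

    Elements are split into consecutive groups of `group_size`, standing in
    for groups of GPU threads (hypothetical grouping). The first
    `half_groups` groups round every step to half precision; the remaining
    groups round to single precision.
    """
    out = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        group = i // group_size
        rnd = to_half if group < half_groups else to_single
        prod = rnd(rnd(a) * rnd(x))      # rounded multiply
        out.append(rnd(prod + rnd(y)))   # rounded add
    return out


def max_abs_error(approx, exact):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(p - q) for p, q in zip(approx, exact))


if __name__ == "__main__":
    a = 0.3
    xs = [0.1 * i for i in range(16)]
    ys = [1.0] * 16
    exact = [a * x + y for x, y in zip(xs, ys)]  # double-precision reference
    for k in range(5):  # 0 to 4 groups of 4 elements in half precision
        err = max_abs_error(axpy_mixed(a, xs, ys, 4, k), exact)
        print(f"half_groups={k}: max abs error = {err:.3e}")
```

Varying `half_groups` moves more or fewer groups to the lower precision, which is one way to picture how a family of variants with different accuracy levels could be generated. The emulation says nothing about speed; any performance benefit depends on the GPU hardware.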

Updated: 2021-02-10