


Improving Accuracy for Matrix Multiplications on GPUs
Scientific Programming, Pub Date: 2011, DOI: 10.3233/spr-2011-0315
Matthew Badin, Lubomir Bic, Michael Dillencourt, Alexandru Nicolau

Reproducibility is a commonly used measure of an experiment's validity. In scientific computing, reproducibility is hard to achieve because floating-point rounding errors accumulate during numerical computation and greatly reduce the accuracy of the result. Matrix multiplication is particularly susceptible to these rounding errors, which is why so many remedies exist, ranging from simulated extra precision to compensated summation algorithms. All of these remedies, however, share the same problem: abysmal performance compared with the original algorithm. Graphics cards are particularly affected because all but the most recent generation lack double precision, so increasing the accuracy of the precision that is available becomes paramount. By selectively applying compensated summation algorithms, our method recovers a whole digit of accuracy on current-generation graphics cards and potentially two digits on the newly released “Fermi” architecture, at a cost of only a 2% drop in performance.
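
To make the term "compensated summation" concrete, the following is a minimal sketch of Kahan summation applied to a single dot product, the inner loop of a matrix multiplication. It is an illustration only and not the authors' selective method, which the abstract does not describe in detail. It runs in Python's double precision rather than single precision on a GPU, and the names `naive_dot` and `kahan_dot` are hypothetical. Only the summation is compensated here; the rounding error of each individual product is left uncorrected.

```python
import math


def naive_dot(a, b):
    """Dot product with plain left-to-right accumulation."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def kahan_dot(a, b):
    """Dot product whose accumulation uses Kahan compensated summation."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x, y in zip(a, b):
        term = x * y - comp               # fold the previous error into this term
        new_total = total + term          # low-order bits of term may be lost here
        comp = (new_total - total) - term  # recover what was just lost
        total = new_total
    return total


# One large product followed by ten products too small to change the
# running sum when they are added one at a time.
a = [1.0] + [1e-8] * 10
b = [1.0] + [1e-8] * 10

# math.fsum returns a correctly rounded sum of the (already rounded) products.
reference = math.fsum(x * y for x, y in zip(a, b))

print("naive    :", repr(naive_dot(a, b)))
print("kahan    :", repr(kahan_dot(a, b)))
print("reference:", repr(reference))
```

In this input the naive loop discards every small product, while the compensated loop carries the lost bits forward in `comp` and ends up much closer to the reference. The extra arithmetic per term is the performance cost the abstract refers to, which is why applying the compensation selectively is attractive.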


Updated: 2020-09-25
