Tall-and-skinny QR factorization with approximate Householder reflectors on graphics processors
The Journal of Supercomputing (IF 2.5), Pub Date: 2020-01-24, DOI: 10.1007/s11227-020-03176-3
Andrés E. Tomás, Enrique S. Quintana-Ortí

We present a novel method for the QR factorization of large tall-and-skinny matrices that introduces an approximation technique for computing the Householder vectors. This approach is very competitive on a hybrid platform equipped with a graphics processor, with a performance advantage over the conventional factorization due to the reduced amount of data transfers between the graphics accelerator and the main memory of the host. Our experiments show that, for tall-and-skinny matrices, the new approach outperforms the code in MAGMA by a large margin, while it remains very competitive for square matrices when memory transfers and CPU computations are the bottleneck of the Householder QR factorization.
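
For orientation, the conventional Householder QR factorization that the abstract uses as its baseline can be sketched as follows. This is a minimal pure-Python illustration with exact reflectors, not the approximate technique of the paper, whose details the abstract does not give. The function name `householder_qr_r` and the sample matrix are assumptions made for this example, and nothing here runs on a graphics processor.

```python
import math


def householder_qr_r(a):
    """Return the n x n upper-triangular factor R of an m x n matrix (m >= n).

    Hypothetical helper for illustration: conventional (exact) Householder
    reflectors, applied one column at a time on the CPU.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]  # work on a copy, leave the input untouched
    for k in range(n):
        # Part of column k from the diagonal down.
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Householder vector v = x + sign(x[0]) * ||x|| * e1;
        # this sign choice avoids cancellation in v[0].
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        vtv = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1.
        for j in range(k, n):
            dot = sum(v[i] * r[k + i][j] for i in range(m - k))
            scale = 2.0 * dot / vtv
            for i in range(m - k):
                r[k + i][j] -= scale * v[i]
    # The rows below the first n are zero up to rounding; return the top block.
    return [row[:] for row in r[:n]]


if __name__ == "__main__":
    # A small tall-and-skinny matrix: many more rows than columns.
    a = [[float(i + j + 1) for j in range(2)] for i in range(6)]
    for row in householder_qr_r(a):
        print(row)
```

Each reflector is built from one column and then applied to the remaining columns. For a tall-and-skinny matrix the columns are long and few, so the cost of forming the reflectors is large relative to the cost of applying them.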

Updated: 2020-01-24