Efficient algorithms for computing rank-revealing factorizations on a GPU
arXiv - CS - Mathematical Software Pub Date : 2021-06-25 , DOI: arxiv-2106.13402
Nathan Heavner, Chao Chen, Abinand Gopal, Per-Gunnar Martinsson

Standard rank-revealing factorizations such as the singular value decomposition and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This paper presents two alternative algorithms for computing a rank-revealing factorization of the form $A = U T V^*$, where $U$ and $V$ are orthogonal and $T$ is triangular. Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve an order-of-magnitude acceleration over finely tuned GPU implementations of the SVD while providing low-rank approximation errors close to those of the SVD.
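To make the form $A = U T V^*$ concrete, the following is a minimal NumPy sketch of one randomized scheme of this general flavor: a random Gaussian matrix is multiplied by $(A^* A)^q$, orthonormalized to give $V$, and an unpivoted QR factorization of $A V$ gives $U$ and an upper triangular $T$. It is an illustration only and is not claimed to be either of the paper's two algorithms. The function name `randomized_utv`, the power parameter `q`, and the `seed` argument are assumptions introduced here, and the sketch assumes a real matrix with at least as many rows as columns. It runs on the CPU; nearly all of the work is in matrix-matrix products and QR factorizations, which is the property the abstract identifies as GPU-friendly.

```python
import numpy as np


def randomized_utv(A, q=2, seed=0):
    """Illustrative randomized factorization A = U @ T @ V.T.

    Assumes A is real with shape (m, n), m >= n. Returns U (m x n, orthonormal
    columns), T (n x n, upper triangular), and V (n x n, orthogonal).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Start from a Gaussian random matrix and apply (A^T A) q times.
    # Each step is a pair of matrix-matrix products (Level-3 BLAS work).
    Y = rng.standard_normal((n, n))
    for _ in range(q):
        Y = A.T @ (A @ Y)
        # Re-orthonormalize between steps so small singular directions
        # are not lost to rounding.
        Y, _ = np.linalg.qr(Y)

    # V is an orthonormal basis whose leading columns tend to align with
    # the dominant right singular vectors of A.
    V, _ = np.linalg.qr(Y)

    # Unpivoted QR of A @ V gives A @ V = U @ T, hence A = U @ T @ V.T.
    U, T = np.linalg.qr(A @ V)
    return U, T, V


def rank_k_approximation(U, T, V, k):
    """Truncate the factorization to rank k: U[:, :k] @ T[:k, :] @ V.T."""
    return U[:, :k] @ T[:k, :] @ V.T
```

A rank-$k$ approximation is obtained by keeping the first $k$ columns of $U$ and the first $k$ rows of $T$. By the Eckart-Young theorem its spectral-norm error can never fall below the $(k+1)$-th singular value of $A$; how close it comes depends on the decay of the singular values and on the choice of `q`.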

Updated: 2021-06-28