Computing rank-revealing factorizations of matrices stored out-of-core
arXiv - CS - Mathematical Software Pub Date : 2020-02-17 , DOI: arxiv-2002.06960
Nathan Heavner, Per-Gunnar Martinsson, Gregorio Quintana-Ortí

This paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in RAM and must instead be stored out of core (out of memory), on slow external memory devices such as solid-state drives or spinning-disk hard drives. Traditional algorithms for computing rank-revealing factorizations, such as the column-pivoted QR factorization, or techniques for computing a full singular value decomposition of a matrix, are very communication intensive. They are naturally expressed as a sequence of matrix-vector operations, which become prohibitively expensive when the data is not available in main memory. Randomization allows these methods to be reformulated so that large contiguous blocks of the matrix can be processed in bulk. The paper describes two distinct methods. The first is a blocked version of column-pivoted Householder QR, organized as a "left-looking" method to minimize the number of write operations (which are more expensive than read operations on a spinning disk drive). The second method results in a so-called UTV factorization, which expresses a matrix $A$ as $A = U T V^*$, where $U$ and $V$ are unitary and $T$ is triangular. This method is organized as an algorithm-by-blocks, in which floating point operations overlap read and write operations. The second method incorporates power iterations and is exceptionally good at revealing the numerical rank; it can often be used as a substitute for a full singular value decomposition. Numerical experiments demonstrate that the new algorithms are almost as fast when processing data stored on a hard drive as traditional algorithms are for data stored in main memory. To be precise, the computational time for fully factorizing an $n\times n$ matrix scales as $cn^{3}$, with a scaling constant $c$ that is only marginally larger when the matrix is stored out of core.
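To make the factorization $A = U T V^*$ with power iterations concrete, the following is a minimal in-memory sketch for real matrices. It is not the paper's out-of-core algorithm-by-blocks: it is a simplified, unblocked randomized UTV in which $V$ comes from orthonormalizing $(A^T A)^q G$ for a Gaussian matrix $G$. The function name `rand_utv` and the parameters `q` and `seed` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rand_utv(A, q=2, seed=0):
    """Unblocked randomized UTV sketch (assumption: real A held in RAM).

    Returns U, T, V with A ~= U @ T @ V.T, where U has orthonormal
    columns, V is orthogonal, and T is upper triangular.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    # Gaussian test matrix; q power iterations push its leading
    # columns toward the dominant right singular subspace of A.
    Y = rng.standard_normal((n, n))
    for _ in range(q):
        Y = A.T @ (A @ Y)
        # Re-orthonormalize after each step to limit loss of accuracy.
        Y, _ = np.linalg.qr(Y)
    V, _ = np.linalg.qr(Y)
    # An unpivoted QR of A @ V yields U and the triangular factor T.
    U, T = np.linalg.qr(A @ V)
    return U, T, V
```

As a rough check, `np.abs(np.diag(T))` can be compared with `np.linalg.svd(A, compute_uv=False)` to see how closely the diagonal of `T` tracks the singular values for a given `q`. An out-of-core variant would additionally have to stream blocks of `A` from disk, which this sketch does not attempt.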

Updated: 2020-03-05