Abstract
Low-rank approximation by QR decomposition with pivoting (pivoted QR) is known to be less accurate than the singular value decomposition (SVD); however, its computational cost is lower than that of SVD. In this study, the least upper bound of the ratio of the truncation error, defined by \(\Vert A-BC\Vert _2\), of pivoted QR to that of SVD is proved to be \(\sqrt{\frac{4^k-1}{3}(n-k)+1}\) for \(A\in {\mathbb {R}}^{m\times n}\) \((m\ge n)\) approximated as a product of \(B\in {\mathbb {R}}^{m\times k}\) and \(C\in {\mathbb {R}}^{k\times n}\).
1 Introduction
1.1 Low-rank approximation
Low-rank matrix approximation approximates a matrix by another matrix whose rank is less than that of the original matrix. Let \(A\in {\mathbb {R}}^{m\times n}\); then, a rank-k approximation of A is given by \(A \approx BC,\)
where \(B\in {\mathbb {R}}^{m\times k}\) and \(C\in {\mathbb {R}}^{k\times n}\). Low-rank matrix approximation appears in many applications such as data mining [5] and machine learning [14]. It also plays an important role in tensor decompositions [12].
This paper discusses truncation errors of low-rank matrix approximation using QR decomposition with pivoting, or pivoted QR. In this study, rounding errors are not considered, and the norm used is basically 2-norm. \(A\in {\mathbb {R}}^{m\times n}\) (without loss of generality, we assume that \(m\ge n\)) is approximated by a product of \(B\in {\mathbb {R}}^{m\times k}\) and \(C\in {\mathbb {R}}^{k\times n}\), and the truncation error is defined by \(\Vert A-BC\Vert _2\).
It is well-known that for any matrix \(A\in {\mathbb {R}}^{m\times n}\) (\(m\ge n\)), there are orthogonal matrices \(U \in {\mathbb {R}}^{m\times m}\) and \(V\in {\mathbb {R}}^{n\times n}\) and a diagonal matrix \(\varSigma \in {\mathbb {R}}^{n\times n}\) with nonnegative diagonal elements that satisfy \(A = U\varSigma V^T.\)
This is a singular value decomposition (SVD) of A. We define \(\sigma _i(A)\) for \(i = 1\), 2, ..., n as the diagonal elements of \(\varSigma\), i.e., \(\varSigma = {\mathrm{diag}}(\sigma _1(A),\sigma _2(A),\dots ,\sigma _n(A)),\)
and assume that \(\sigma _1(A) \ge \sigma _2(A) \ge \dots \ge \sigma _n(A) \ge 0\) without loss of generality. The \(\sigma _i\) values are the singular values of A. A has rank k if and only if \(\sigma _k(A) > 0 = \sigma _{k+1}(A)\). Let \(A_k = U\varSigma _kV^T\), where \(\varSigma _k = {\mathrm{diag}}(\sigma _1(A),\dots ,\sigma _k(A),0,\dots ,0)\in {\mathbb {R}}^{n\times n}\).
Then, \(\Vert A - A_k\Vert _2 = \sigma _{k+1}(A) = \min _{{\mathrm{rank}}(X)\le k}\Vert A-X\Vert _2\) holds [8]. Therefore, \(A_k\) is a rank-k approximation of A whose 2-norm truncation error is the smallest. We define the truncation error of low-rank approximation by SVD as \(SVD_k(A) = \sigma _{k+1}(A).\)
The amount of computation required to calculate SVD is \(O(nm \min (n,m))\).
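The quantity \(SVD_k(A)\) can be illustrated with a short numpy sketch (illustrative only; the function name `svd_error` is ours). It forms the best rank-k approximation from a truncated SVD and returns its 2-norm error together with \(\sigma _{k+1}(A)\), which coincide by the optimality result above.

```python
import numpy as np

def svd_error(A, k):
    """Return the 2-norm truncation error of the best rank-k
    approximation of A together with sigma_{k+1}(A)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # truncated SVD A_k
    return np.linalg.norm(A - A_k, 2), s[k]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
err, sigma_k1 = svd_error(A, 2)  # the two values agree (Eckart-Young)
```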
Pivoted QR was proposed by Golub in 1965 [7]. Because the amount of computation required to calculate the low-rank approximation by pivoted QR is O(nmk), it is cheaper than SVD and hence useful in many applications such as solving rank-deficient least squares problems [2]. It consists of QR decomposition and pivoting. For any matrix A, there exist \(Q\in {\mathbb {R}}^{m\times n}\) and an upper triangular matrix \(R\in {\mathbb {R}}^{n\times n}\) that satisfy \(A=QR\) and \(Q^TQ=I_n\). This is a QR decomposition of A. We use pivoting to determine the permutation matrix \(\varPi _{grd}\) and apply the QR decomposition algorithm to \(A\varPi _{grd}\). The subscript grd signifies the greedy method, explained below. Hereafter, we redefine QR as a QR decomposition of \(A\varPi _{grd}=QR\). Let Q and R be partitioned as \(Q = \begin{pmatrix} Q_{1k}&Q_{2k} \end{pmatrix}\) and \(R = \begin{pmatrix} R_{1k} &{} R_{2k}\\ O &{} R_{3k} \end{pmatrix},\)
where \(Q_{1k}\in {\mathbb {R}}^{m\times k}\) and \(R_{1k}\in {\mathbb {R}}^{k\times k}\). Then, we can approximate A by \(Q_{1k}\begin{pmatrix} R_{1k}&R_{2k} \end{pmatrix}\varPi _{grd}^T\), and \(\Vert A - Q_{1k}\begin{pmatrix} R_{1k}&R_{2k} \end{pmatrix}\varPi _{grd}^T\Vert _2 = \Vert R_{3k}\Vert _2\)
holds. We define the truncation error of low-rank approximation by pivoted QR as \(pivotQR_k(A) = \Vert R_{3k}\Vert _2.\)
In this study, the greedy method is used in pivoting to make \(\Vert R_{3k}\Vert _2\) small. Pivoting is performed such that the elements of \(R = (r_{ij})\) satisfy the following inequalities [1, p. 103]: \(r_{ll}^2 \ge \sum _{i=l}^{j} r_{ij}^2, \quad l=1, 2, \dots , n-1,\; j=l+1, l+2, \dots , n. \qquad (1)\)
Condition (1) for \(l=k+1\), \(k+2\), ..., \(n-1\) is not used in the error analysis.
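A corresponding sketch for \(pivotQR_k(A)\) (again illustrative, with names of our choosing; scipy's `qr` with `pivoting=True` implements the greedy column pivoting described above): the 2-norm error of the rank-k approximation built from the pivoted QR factors equals \(\Vert R_{3k}\Vert _2\).

```python
import numpy as np
from scipy.linalg import qr

def pivoted_qr_error(A, k):
    """Return ||A - BC||_2 for the rank-k approximation from pivoted QR,
    together with ||R_3k||_2, the norm of the trailing block of R."""
    Q, R, piv = qr(A, pivoting=True, mode='economic')  # A[:, piv] = Q @ R
    B = Q[:, :k]                    # m x k factor
    C = np.empty_like(R[:k, :])     # k x n factor: C = R[:k, :] Pi^T
    C[:, piv] = R[:k, :]
    return np.linalg.norm(A - B @ C, 2), np.linalg.norm(R[k:, k:], 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((7, 5))
err, r3norm = pivoted_qr_error(A, 2)  # the two values agree
```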
The greedy method of pivoting is not always optimal. QR decompositions \(A\varPi _{RR} = Q_{RR}R_{RR}\), in which the permutation \(\varPi _{RR}\) is chosen such that \(R_{RR}\) has a small lower right block, are called rank-revealing QR (RRQR) decompositions. The following theorem was shown by Hong and Pan in 1992 [9].
Theorem 1
Let \(m\ge n > k\), and \(A\in {\mathbb {R}}^{m\times n}\). Then, there exists a permutation matrix \(\varPi \in {\mathbb {R}}^{n\times n}\) such that the diagonal blocks of \(R = \begin{pmatrix}R_1 &{}\quad R_2\\ O &{}\quad R_3\end{pmatrix}\), the upper triangular factor of the QR decomposition of \(A\varPi\) with \(R_1\in {\mathbb {R}}^{k\times k}\), satisfy the following inequality:
Finding the optimal permutation matrix is not practical from the viewpoint of computational complexity.
1.2 Truncation error of pivoted QR
Pivoted QR sometimes results in a large truncation error. A well-known example is the Kahan matrix [10]. In 1968, Faddeev et al. [6] showed that
Furthermore,
holds [3].
However, a survey in 2017 stated that "very little is known in theory about its behaviour" [13, p. 2218] with regard to pivoted QR; thus, there is still room for further research on pivoted QR.
Our previous work showed that the least upper bound of the ratio of the truncation error of pivoted QR to that of SVD is \(\sqrt{\frac{4^{n-1}+2}{3}}\) when an \(m\times n\) (\(m\ge n\)) matrix is approximated by a matrix of rank \(n-1\), i.e., for \(k = n - 1\) [11]. The tight upper bound for all k is proved in the rest of this paper.
We assume that all matrices and vectors in this paper are real; however, the discussion extends easily to the complex case, with the same results.
2 Preliminaries
In this section, we define the notation and examine the basic properties needed to analyze the truncation errors. First, we introduce the generalized inverse and the concept \({\mathrm{resi}}\).
Proposition 1
[1, p. 16] For \(A\in {\mathbb {R}}^{m\times n}\), there exists \(X\in {\mathbb {R}}^{n\times m}\) that satisfies \(AXA = A\), \(XAX = X\), \((AX)^T = AX\), and \((XA)^T = XA\), and X is uniquely determined by these four conditions.
Definition 1
For \(A\in {\mathbb {R}}^{m\times n}\) (\(m\ge n\)), the generalized inverse of A is defined by \(X\in {\mathbb {R}}^{n\times m}\) that satisfies the four conditions in Proposition 1 and is denoted by \(A^{\dagger }\).
The following notation is closely related to the truncation error of pivoted QR.
Definition 2
Let \(A\in {\mathbb {R}}^{m\times n}\) (\(m\ge n\)) and \(B\in {\mathbb {R}}^{m \times l}\). We define \({\mathrm{resi}}(A,B)\) as \({\mathrm{resi}}(A,B) = B - AA^{\dagger }B.\)
We denote the inner product of two vectors \(\varvec{x}\) and \(\varvec{y}\) as \((\varvec{x},\varvec{y})\).
Example 1
For \(\varvec{x}\in {\mathbb {R}}^{n}\) and \(\varvec{y}\in {\mathbb {R}}^{n}\), if \(\varvec{x} \ne \varvec{0}\), then the following holds: \({\mathrm{resi}}(\varvec{x},\varvec{y}) = \varvec{y} - \frac{(\varvec{x},\varvec{y})}{(\varvec{x},\varvec{x})}\varvec{x}.\)
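Under Definition 2, \({\mathrm{resi}}(A,B)\) is the residual of the least-squares fit of B by the columns of A, which a short numpy sketch can compute with `lstsq` (the helper name `resi` is ours). For a single nonzero column x, the residual of y is y minus its orthogonal projection onto x, as in Example 1.

```python
import numpy as np

def resi(A, B):
    """resi(A, B) = B - A A^+ B: the residual of the
    least-squares problem min_X ||B - A X||."""
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    return B - A @ X

# Example 1 for vectors: residual of y against a single nonzero column x
rng = np.random.default_rng(1)
x = rng.standard_normal((5, 1))
y = rng.standard_normal((5, 1))
projection_residual = y - (x.T @ y) / (x.T @ x) * x
```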
The following lemma will be used to identify \({\mathrm{resi}}\).
Lemma 1
Let \(A\in {\mathbb {R}}^{m\times n}\) \((m\ge n)\) and \(B\in {\mathbb {R}}^{m\times l}\). For \(X\in {\mathbb {R}}^{n\times l}\), \({\mathrm{resi}}(A,B) = B-AX\) holds if and only if \(A^TAX = A^TB\) holds.
Proof
If \({\mathrm{resi}}(A,B) = B-AX\) holds, then
holds. If \(A^TAX-A^TB = O\) holds, then
holds. \(\square\)
Lemma 2
[1, p. 5] Let \(A\in {\mathbb {R}}^{m\times n}\) \((m\ge n)\), \(\varvec{b}\in {\mathbb {R}}^{m}\), and \(\varvec{x}\in {\mathbb {R}}^{n}\). \(\Vert \varvec{b}-A\varvec{x}\Vert \le \Vert \varvec{b}-A\varvec{y}\Vert\) holds for any \(\varvec{y}\in {\mathbb {R}}^n\) if and only if \(A^T(A\varvec{x}-\varvec{b})=\varvec{0}\) holds.
Using Lemmas 1 and 2, we can obtain the following lemma.
Lemma 3
Let \(A\in {\mathbb {R}}^{m\times n}\) \((m\ge n)\), \(\varvec{b}\in {\mathbb {R}}^{m}\), and \(\varvec{x}\in {\mathbb {R}}^{n}\). \(\Vert \varvec{b}-A\varvec{x}\Vert \le \Vert \varvec{b}-A\varvec{y}\Vert\) holds for any \(\varvec{y}\in {\mathbb {R}}^n\) if and only if \({\mathrm{resi}}(A,\varvec{b}) = \varvec{b} - A\varvec{x}\) holds.
Lemma 4
Let \(m\ge n > k\), \(A\in {\mathbb {R}}^{m\times n}\), and \(B\in {\mathbb {R}}^{m\times l}\). Let A be partitioned as \(A = \begin{pmatrix} A_{1k}&A_{2k} \end{pmatrix},\)
where \(A_{1k}\in {\mathbb {R}}^{m\times k}\). Then, \({\mathrm{resi}}(A,B) = {\mathrm{resi}}\left( {\mathrm{resi}}(A_{1k},A_{2k}),{\mathrm{resi}}(A_{1k},B)\right)\) holds.
Proof
From the definition of \({\mathrm{resi}}\), we can see that
and
hold where \(X = A_{1k}^{\dagger }A_{2k}\), \(Y = A_{1k}^{\dagger }B\) and \(Z = {\mathrm{resi}}(A_{1k},A_{2k})^{\dagger }{\mathrm{resi}}(A_{1k},B)\). Thus,
holds from (2), (3), and (4). Lemma 1 proves
from (2),
from (3), and
from (4). We can see that
from (5), (6), and (7). We can see that
from (2), (4), (8), and (9). Then, (9) and (10) can be combined as
Next, (5) can be rewritten as
From this and (11), we have
Application of Lemma 1 to this proves the lemma. \(\square\)
QR decomposition and \({\mathrm{resi}}\) have the following relation. Note that QR in this lemma is without pivoting.
Lemma 5
Let \(m\ge n > l\), \(A\in {\mathbb {R}}^{m\times n}\), and \(A=QR\) be a QR decomposition partitioned as \(A = \begin{pmatrix} A_{1l}&A_{2l} \end{pmatrix}\), \(Q = \begin{pmatrix} Q_{1l}&Q_{2l} \end{pmatrix}\), \(R = \begin{pmatrix} R_{1l} &{} R_{2l}\\ O &{} R_{3l} \end{pmatrix},\)
where \(A_{1l}\in {\mathbb {R}}^{m\times l},Q_{1l}\in {\mathbb {R}}^{m\times l},R_{1l}\in {\mathbb {R}}^{l\times l}\). If \({\mathrm{rank}}(A_{1l}) = l\) holds, then \({\mathrm{resi}}(A_{1l},A_{2l}) = Q_{2l}R_{3l}\) holds.
Proof
We have
Let
Then, we have
Furthermore,
holds. Application of Lemma 1 to this proves the lemma. \(\square\)
Then, we return to pivoted QR. Let \(A\varPi _{grd} = \begin{pmatrix} A_{1k}&A_{2k} \end{pmatrix},\)
where \(A_{1k}\in {\mathbb {R}}^{m\times k}\). From Lemma 5, we can see that
for \(l=1\), 2, ..., k and \(j=l+1\), \(l+2\), ..., n, and \(pivotQR_k(A) = \Vert {\mathrm{resi}}(A_{1k},A_{2k})\Vert _2\) if \({\mathrm{rank}}(A_{1k}) = k\) holds. The last equation shows that, as long as \({\mathrm{rank}}(A_{1k}) = k\) holds, the value of \(pivotQR_k(A)\) is determined only by \(A_{1k}\) and \(A_{2k}\), or equivalently by \(\varPi _{grd}\), and is independent of how the QR decomposition is computed.
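The identity \(pivotQR_k(A) = \Vert {\mathrm{resi}}(A_{1k},A_{2k})\Vert _2\) can be checked numerically, as in the sketch below (our variable names; it assumes the generic case \({\mathrm{rank}}(A_{1k}) = k\), which holds almost surely for a random matrix).

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
m, n, k = 8, 5, 3
A = rng.standard_normal((m, n))

# pivotQR_k(A): norm of the trailing block of the pivoted QR factor
Q, R, piv = qr(A, pivoting=True, mode='economic')
r3norm = np.linalg.norm(R[k:, k:], 2)

# The same value from resi(A_1k, A_2k) of the permuted matrix A Pi_grd
A1, A2 = A[:, piv[:k]], A[:, piv[k:]]
X, *_ = np.linalg.lstsq(A1, A2, rcond=None)
resi_norm = np.linalg.norm(A2 - A1 @ X, 2)
```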
3 Evaluation from above
We bound \(\frac{pivotQR_k(A)}{SVD_k(A)}\) from above in this section. Since \(pivotQR_k(A) = SVD_k(A) = 0\) holds if \({\mathrm{rank}}(A) \le k\) holds, we only consider the case \({\mathrm{rank}}(A) > k\). Let \(A=U\varSigma V^T\) be one SVD. Since \(A\varPi _{grd} = U\varSigma (\varPi _{grd}^TV)^T\) and \((\varPi _{grd}^TV)^T(\varPi _{grd}^TV)=I_n\) hold, \(U\varSigma (\varPi _{grd}^TV)^T\) is one SVD of \(A\varPi _{grd}\). Then, we can see that
Hereafter, A denotes what was previously \(A\varPi _{grd}\). Let \(A\in {\mathbb {R}}^{m\times n}\) that satisfies
be partitioned as
where \(A_{1k} \in {\mathbb {R}}^{m\times k}\) and \({{\mathrm{rank}}}(A_{1k}) = k\). We should compare \(\sigma _{k+1}(A) = SVD_k(A)\) and \(\Vert {\mathrm{resi}}(A_{1k},A_{2k})\Vert _2=pivotQR_k(A)\).
Lemma 6
Let \(m\ge n\), \(A\in {\mathbb {R}}^{m\times n}\), and \(B\in {\mathbb {R}}^{m\times l}\). For any \(\varvec{v}\in {\mathbb {R}}^{l}\),
holds.
Proof
From the definition of \({\mathrm{resi}}\),
holds. Thus,
holds. \(\square\)
We can see that
from the definition of 2-norm and Lemma 6. Now, we introduce an essential theorem of this paper.
Theorem 2
Let \(m\ge n > 1\), \(A\in {\mathbb {R}}^{m\times n}\), \({\mathrm{rank}}(A)=n\), and A be partitioned into columns as \(A = \begin{pmatrix} \varvec{a}_1&\varvec{a}_2&\dots&\varvec{a}_n \end{pmatrix}.\)
We define \(\hat{A_i}\) as \(\hat{A_i} = \begin{pmatrix} \varvec{a}_1&\dots&\varvec{a}_{i-1}&\varvec{a}_{i+1}&\dots&\varvec{a}_n \end{pmatrix}\) for \(i=1\), 2, ..., n, and \(\varvec{d}_i\) as \(\varvec{d}_i = {\mathrm{resi}}(\hat{A_i},\varvec{a}_i)\) for \(i=1\), 2, ..., n. Then, \(\varvec{d}_i \ne \varvec{0}\) for \(i=1\), 2, ..., n and
hold.
Proof
Since \({\mathrm{rank}}(A) = n\), \(\{\varvec{a}_1,\varvec{a}_2,\dots ,\varvec{a}_n\}\) is linearly independent. Because \(\varvec{d}_i\) is a linear combination of \(\{\varvec{a}_1,\varvec{a}_2,\dots ,\varvec{a}_n\}\) with the coefficient of \(\varvec{a}_i\) being 1, \(\varvec{d}_i \ne \varvec{0}\) holds for \(i=1\), 2, ..., n. From the definition of \({\mathrm{resi}}\),
holds, where \(\varvec{x}_1 = \hat{A_1}^{\dagger }\varvec{a}_1\). Let \(\varvec{x}_1 = \begin{pmatrix} x_{12}&x_{13}&\dots&x_{1n} \end{pmatrix}^T\). Let i be one of 2, 3, ..., n. We can see that
holds if \(x_{1i} \ne 0\) from Lemma 3. Thus,
holds. This (13) also holds if \(x_{1i} = 0\). We define \(\varvec{y}\in {\mathbb {R}}^{m}\) as
Since \(\{\varvec{a}_1,\varvec{a}_2,\dots ,\varvec{a}_{n-1}\}\) is linearly independent, \(\varvec{y}\ne \varvec{0}\) holds. As Lemma 1 gives \(\hat{A_1}^T\varvec{d}_1=\varvec{0}\), we have \((\varvec{a}_n,\varvec{d}_1) = 0\). Thus,
holds. We can see that
holds from Lemma 3 because \(\varvec{y}\) is a linear combination of \(\varvec{a}_i\) \((i=1, 2, \dots , n-1)\). Since
and \(\Vert \varvec{d}_n\Vert > 0\) hold,
holds. Furthermore, since
holds from (13),
holds, and the theorem has been proved. \(\square\)
We refer to an essential theorem by Hong and Pan.
Theorem 3
[9, p. 218] Let \(m\ge n > l\), \(A\in {\mathbb {R}}^{m\times n}\) and \(A = QR = U\varSigma V^T\) be a QR decomposition and an SVD, respectively. Let R and V be partitioned as
where \(R_{1l}\in {\mathbb {R}}^{l\times l}\) and \(V_{1l}\in {\mathbb {R}}^{l\times l}\).
holds.
In the present study, this theorem is only used for \(l = n-1\). The following lemma provides an inequality between \({\mathrm{resi}}\) and the singular value.
Lemma 7
Under the same assumptions as Theorem 2,
holds.
Proof
Let \(A=U\varSigma V^T\) be an SVD partitioned as
where \(V_1\in {\mathbb {R}}^{n\times (n-1)}\). Let \(\varvec{e}_i\) be the ith column of \(I_n\) for \(i=1\), 2, ..., n. Define a permutation matrix \(\varPi _i\) as
for \(i=1\), 2, ..., n. Since
and \((\varPi _i^TV)^T(\varPi _i^TV) = I_n\), \(U\varSigma (\varPi _i^TV)^T\) is one SVD of \(\begin{pmatrix} \hat{A_i}&\varvec{a}_i \end{pmatrix}\). Let \(A\varPi _i = Q_iR_i\) be a QR decomposition partitioned as
where \(Q_{i1}\in {\mathbb {R}}^{m\times (n-1)},R_{i1}\in {\mathbb {R}}^{(n-1)\times (n-1)}\). Using Theorem 3,
holds. We can see that
holds from Lemma 5. Thus,
holds. Then,
holds. \(\square\)
Proposition 2
Let \(m\ge n > k\) and \(A \in {\mathbb {R}}^{m\times n}\) satisfy (12) and be partitioned as \(A = \begin{pmatrix} A_{1k}&A_{2k} \end{pmatrix},\)
where \(A_{1k} \in {\mathbb {R}}^{m\times k}\). Let A satisfy \({\mathrm{rank}}(A_{1k})=k\). Then, for all \(\varvec{z}\in {\mathbb {R}}^{n-k}\) with \(\Vert \varvec{z}\Vert =1\),
holds.
Proof
From (12) and Lemma 6, the following holds for \(i=1\), 2, ..., k:
Define \(A'\) as
If \({\mathrm{rank}}(A') \ne k+1\), then \(\{ \varvec{a}_1, \varvec{a}_2, \dots , \varvec{a}_k, A_{2k}\varvec{z}\}\) is linearly dependent. Since
\({\mathrm{rank}}(A_{1k}) = k\), \(\{\varvec{a}_1 , \varvec{a}_2 , \dots , \varvec{a}_k\}\) is linearly independent, and \(A_{2k}\varvec{z}\) can be expressed as a linear combination of \(\{\varvec{a}_1, \varvec{a}_2, \dots , \varvec{a}_k\}\). Then, we have
\({\mathrm{resi}}(A_{1k},A_{2k}\varvec{z}) = \varvec{0}\) from Lemma 3, and the conclusion holds. Therefore, we only consider the case \({\mathrm{rank}}(A') = k+1\) in the remainder of this proof. We define \(\varvec{d}'_i\) as
From Lemma 4, we can see that
holds for \(i=1\), 2, ..., k and \(j=i\), \(i+1\), ..., k, where
\(A_{ijk}' = \begin{pmatrix} \varvec{a}_{i}&\dots&\varvec{a}_{j-1}&\varvec{a}_{j+1}&\dots&\varvec{a}_k&A_{2k}\varvec{z} \end{pmatrix}\), and
holds for \(i=1\), 2, ..., k. Using Theorem 2 on \({\mathrm{resi}}\left( \begin{pmatrix} \varvec{a}_1&\varvec{a}_2&\dots&\varvec{a}_{i-1} \end{pmatrix},\begin{pmatrix} \varvec{a}_{i}&\varvec{a}_{i+1}&\dots&\varvec{a}_k&A_{2k}\varvec{z} \end{pmatrix}\right)\), we can see that
holds. Thus,
holds for \(i = 1\), 2, ..., k from (12) and (14). Thus,
holds. We want to show that
and prove this using induction in the order of \(i=k\), \(k-1\), ..., 1. Applying (15) for \(i=k\) gives
Thus, (16) is shown in case \(i=k\). Then, we prove that (16) holds for \(i=l\), assuming that (16) holds for \(i=l+1\), \(l+2\), ..., k. We can see that
holds from (15) and the assumption of induction. Thus, (16) has been shown in case \(i=1\), 2, ..., k. Using Lemma 7 on \(A'\),
holds. Thus,
holds. Now, if we can show that
then the proof is complete. Considering the fact that
we want a subspace \(\varTheta\) that satisfies
Let
Then, we have \({\mathrm{dim}}(\varTheta ') = k+1\) since \(\left\{ \varvec{e}_1 , \varvec{e}_2 , \dots , \varvec{e}_k , \begin{pmatrix} \varvec{0}\\ \varvec{z} \end{pmatrix} \right\}\) is linearly independent. Let \(\varvec{y} = (y_i)\in {\mathbb {R}}^{k+1}\). Since \(\begin{pmatrix} \varvec{e}_1 &{} \varvec{e}_2 &{} \dots &{} \varvec{e}_k &{} \begin{pmatrix} \varvec{0}\\ \varvec{z} \end{pmatrix} \end{pmatrix}^T\begin{pmatrix} \varvec{e}_1 &{} \varvec{e}_2 &{} \dots &{} \varvec{e}_k &{} \begin{pmatrix} \varvec{0}\\ \varvec{z} \end{pmatrix} \end{pmatrix}=I_{k+1}\) holds,
holds. For all \(\varvec{y}\in {\mathbb {R}}^{k+1}\) that satisfies the right-hand side of (17),
holds. Then,
holds. \(\square\)
Thus, we have proved that
4 Evaluation from below
In this section, we show that the inequality proved in the previous section is tight. An example of matrix \(R_h\) with real-valued parameter h that satisfies
is shown. \(R_h\) is as follows:
The Kahan matrix is [10]
Therefore, \(R_h\) coincides with the Kahan matrix in the case \(m = n = k+1\) and is an extension of the Kahan matrix otherwise.
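For reference, the classic Kahan matrix can be written as \(K_n(\theta ) = {\mathrm{diag}}(1,s,\dots ,s^{n-1})(I - cN)\), where \(s=\sin \theta\), \(c=\cos \theta\), and N is the strictly upper triangular matrix of ones — a standard construction, sketched below. The check only asserts the trivial lower bound \(pivotQR_{n-1}(K) \ge \sigma _n(K)\), since floating-point pivoting may reorder the columns of a Kahan matrix.

```python
import numpy as np
from scipy.linalg import qr

def kahan(n, theta):
    """Classic Kahan matrix: diag(1, s, ..., s^{n-1}) times a unit upper
    triangular matrix with -c above the diagonal, where s^2 + c^2 = 1."""
    s, c = np.sin(theta), np.cos(theta)
    T = np.eye(n) - c * np.triu(np.ones((n, n)), 1)
    return np.diag(s ** np.arange(n)) @ T

n = 10
K = kahan(n, 1.2)
_, R, _ = qr(K, pivoting=True, mode='economic')
pivot_err = abs(R[-1, -1])                        # pivotQR_{n-1}(K)
svd_err = np.linalg.svd(K, compute_uv=False)[-1]  # sigma_n(K)
ratio = pivot_err / svd_err                       # at least 1 in theory
```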
Proposition 3
Let \(m\ge n > k\). Define \(\varSigma _h \in {\mathbb {R}}^{m\times n}\), \((w_{hij}) = W_h \in {\mathbb {R}}^{n\times n}\), and \(R_h\in {\mathbb {R}}^{m\times n}\) as follows:
and \(R_h=\varSigma _h W_h\) where \(0< h < 1\). Then,
holds.
Proof
Let \(Q = \begin{pmatrix} I_n\\ O \end{pmatrix} \in {\mathbb {R}}^{m \times n}\) and \(R = {\mathrm{diag}}(1,h,\dots ,h^{k},0,0,\dots ,0)W_h \in {\mathbb {R}}^{n \times n}\). Since R is an upper triangular matrix and \(Q^TQ=I_n\) holds, \(R_h = QR\) is one QR decomposition. We check (1) for this R. Since
(1) holds for \(l=1\), 2, ..., \(k+1\), \(j=l+1\), \(l+2\), ..., n. Obviously (1) also holds for \(l=k+2\), \(k+3\), ..., \(n-1\), \(j=l+1\), \(l+2\), ..., n. As in Sect. 2, let R be partitioned as
where \(R_{1k} \in {\mathbb {R}}^{k\times k}\). Then,
holds. Define \(V\in {\mathbb {R}}^{(n-k)\times (n-k)}\) and \(\varvec{v}_1\in {\mathbb {R}}^{n-k}\) as follows:
where \(\varvec{v}_2\), \(\varvec{v}_3\), ..., \(\varvec{v}_{n-k}\) are chosen such that \(V^TV = I_{n-k}\) holds. We can choose them freely as long as this is satisfied. Since
holds, \(\Vert R_{3k}\Vert _2 = h^{k}\sqrt{n-k}\) holds. We consider the value of \(SVD_k(R_h) = \sigma _{k+1}(R_h)\). Considering the fact that
we want a subspace \(\varTheta\) whose \(\max _{\varvec{x}\in \varTheta ,\Vert \varvec{x}\Vert =1}\Vert R_h\varvec{x}\Vert\) is small. Since \(\varvec{v}_1^T\varvec{v}_i=0\) holds for \(i=2\), 3, ..., \(n-k\),
holds for \(i=2\), 3, ..., \(n-k\). We define \(y_j = 1\) for \(j=k+1\), ..., n and define \(y_j\) from \(j=k\) down to \(j=1\) as \(y_j = \sqrt{1-h^2}\sum _{i=j+1}^n y_i\). We define \(\varvec{y}\in {\mathbb {R}}^n\) as \(\begin{pmatrix} y_1&y_2&\dots&y_n \end{pmatrix}^T\). Then,
holds. Since
holds,
holds. Let
Then, we have \({\mathrm{dim}}(\varTheta ') = n-k\). Since
holds, we have
for \(z_1\), \(z_2\), ..., \(z_{n-k}\in {\mathbb {R}}\). Thus, if the right-hand side of (18) holds, we have
Then, we have
Thus,
holds, and the proposition has been proved using the results of the previous section. \(\square\)
5 Numerical experiments
5.1 Experiments
Because \(SVD_{k}(R_h)\) cannot be calculated numerically when h is very small, we prepare another matrix for the experiments. Here, we present the matrix in the case \(k=n-1\).
Proposition 4
[11] Let \(m\ge n\) and \(\epsilon _i\in {\mathbb {R}}\) satisfy \(0<\epsilon _i < 1\) \((i=1, 2, \dots , n)\). Let \(\{\varvec{v}_i\}_{i=1}^n\in {\mathbb {R}}^n\) be
where \(v_{j,i}\) is the j-th element of \(\varvec{v}_i\). Let \(\{\varvec{w}_i\}_{i=1}^n\) be the output of performing the Gram-Schmidt orthonormalization on \(\{\varvec{v}_i\}_{i=1}^n\). Define \(\varSigma \in {\mathbb {R}}^{m\times n}\) and \(W \in {\mathbb {R}}^{n\times n}\) as follows:
where \(\sigma _1 \ge \sigma _2 \ge \dots \ge \sigma _n > 0\). Let \(A=\varSigma W^T\). Then,
holds.
This proposition provides matrices whose error ratio converges to the least upper bound only in the case \(k = n-1\). In this paper, matrices whose ratio is conjectured to converge to the least upper bound in the case \(k < n-1\) are defined analogously as follows.
Conjecture 1
Let \(m\ge n > k\) and \(\epsilon _i\in {\mathbb {R}}\) satisfy \(0< \epsilon _i < 1\) \((i=1, 2, \dots , n)\). Let \(\{\varvec{v}_i\}_{i=1}^n\in {\mathbb {R}}^n\) be
where \(v_{j,i}\) is the j-th element of \(\varvec{v}_i\). Let \(\{\varvec{w}_i\}_{i=1}^n\) be the output of performing the Gram-Schmidt orthonormalization on \(\{\varvec{v}_i\}_{i=1}^n\). Define \(\varSigma \in {\mathbb {R}}^{m\times n}\) and \(W \in {\mathbb {R}}^{n\times n}\) as follows:
where \(\sigma _1 \ge \sigma _2 \ge \dots \ge \sigma _{k+1} > 0\). Let \(A=\varSigma W^T\). Then,
holds. \(\square\)
\(\{\varvec{v}_i\}_{i=1}^n\) is as follows:
Because the limits cannot be taken in numerical experiments, we assign \(s^{n-i}\) to \(\sigma _i\) for \(i=1\), 2, ..., \(k+1\) and \(s^{-1}\) to \(\epsilon _i\) for \(i=1\), 2, ..., n. Numerical experiments were performed for \(n=2\), 3, ..., 25 and \(k=1\), 2, ..., \(n-1\) with \(m = n\). We calculate u as follows:
Because we want \(\frac{pivotQR_{k}(A)}{SVD_{k}(A)}\rightarrow \sqrt{\frac{4^k-1}{3}(n-k)+1}\) as \(s \rightarrow \infty\), we want \(u \rightarrow 0\) as \(s \rightarrow \infty\).
5.2 Environment
Python 3.5.2, scipy 1.4.1, and numpy 1.13.3 were used. The data type used in these experiments was double precision. We used the SVD of numpy and the pivoted QR of scipy. Gram-Schmidt orthonormalization was performed by applying the scipy QR decomposition twice. Let \(V = \begin{pmatrix} \varvec{v}_1&\dots&\varvec{v}_n \end{pmatrix}\), and let \(V = QR\) and \(Q = WR'\) be QR decompositions. The i-th column of W is assigned to \(\varvec{w}_i\) for \(i=1\), 2, ..., n.
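The orthonormalization step can be sketched as follows (our reconstruction of the procedure described above; applying QR twice is the usual "twice is enough" reorthogonalization):

```python
import numpy as np
from scipy.linalg import qr

def orthonormalize_twice(V):
    """Orthonormalize the columns of V by applying the QR decomposition
    twice; column i of the result spans the same nested subspaces as V."""
    Q, _ = qr(V, mode='economic')   # V = Q R
    W, _ = qr(Q, mode='economic')   # Q = W R'
    return W

rng = np.random.default_rng(3)
V = rng.standard_normal((6, 4))
W = orthonormalize_twice(V)         # w_i is the i-th column of W
```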
5.3 Results
The maximum value of u is \(2.3482077194\cdot 10^{-5}\), and \(u \ge 0\) holds in all cases, for \(s = 10^9\). The results for \(m = n = 2\), 3, ..., 25, \(k = n-1\), \(\lfloor \frac{n}{2}\rfloor\), \(\lfloor \log _2 n\rfloor\), and \(s = 10^9\) are shown in Fig. 1. The ratio comes close to the least upper bound, and u increases monotonically with n and k. The results for \(m = n = 25\), \(k = n-1\), \(\lfloor \frac{n}{2}\rfloor\), \(\lfloor \log _2 n\rfloor\), and \(s=10\), \(10^2\), ..., \(10^9\) are shown in Fig. 2. We can see that u decreases monotonically as s increases. From these results, the numerical solutions are likely to converge to the least upper bound.
6 Conclusion
We compared the 2-norm of the truncation error of pivoted QR with that of SVD and obtained the following theorem:
Theorem 4
Let \(m\ge n > k\). For any \(A\in {\mathbb {R}}^{m\times n}\), \(pivotQR_k(A) \le \sqrt{\frac{4^k-1}{3}(n-k)+1}\; SVD_k(A)\) holds. Furthermore, for any \(t\in {\mathbb {R}}\) that satisfies \(t < \sqrt{\frac{4^k-1}{3}(n-k)+1}\), there exists \(A\in {\mathbb {R}}^{m\times n}\) that satisfies \(pivotQR_k(A) > t\; SVD_k(A).\)
\(\square\)
This theorem states that the least upper bound of the ratio of the truncation error of pivoted QR to that of SVD is \(\sqrt{\frac{4^k-1}{3}(n-k)+1}\) when an \(m\times n\) \((m\ge n)\) matrix is approximated by a matrix of rank k. Furthermore, an example in which the ratio converges to the least upper bound was found. We also found, through numerical experiments, examples in which the ratio is close to the least upper bound when n is small.
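Theorem 4 can be sanity-checked numerically on random matrices, as in the sketch below (the helper `error_ratio` is ours; the check only verifies that the ratio lies between 1 and the proven bound, not that the bound is attained):

```python
import numpy as np
from scipy.linalg import qr

def error_ratio(A, k):
    """pivotQR_k(A) / SVD_k(A): ratio of 2-norm truncation errors."""
    _, R, _ = qr(A, pivoting=True, mode='economic')
    pivot_err = np.linalg.norm(R[k:, k:], 2)         # ||R_3k||_2
    svd_err = np.linalg.svd(A, compute_uv=False)[k]  # sigma_{k+1}(A)
    return pivot_err / svd_err

def bound(n, k):
    """Least upper bound of the ratio proved in Theorem 4."""
    return np.sqrt((4.0 ** k - 1.0) / 3.0 * (n - k) + 1.0)

rng = np.random.default_rng(4)
m, n = 12, 8
ratios = [error_ratio(rng.standard_normal((m, n)), k) for k in range(1, n)]
```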
7 Future work
We have seen that the least upper bound of the ratio of the truncation error of pivoted QR to that of SVD is \(O(2^k\sqrt{n})\). However, this upper bound is attained only by contrived examples [4]. We expect that the upper bound may become significantly smaller under additional restrictions on the matrices, and we intend to identify such restrictions.
References
Björck, Å.: Numerical Methods for Least Squares Problems. SIAM, Philadelphia (1996)
Businger, P., Golub, G.H.: Linear least squares solutions by Householder transformations. Numer. Math. 7, 269–276 (1965)
Chandrasekaran, S., Ipsen, I.C.F.: On rank-revealing factorisations. SIAM J. Matrix Anal. Appl. 15(2), 592–622 (1994)
Drmač, Z., Gugercin, S.: A new selection operator for the discrete empirical interpolation method-improved a priori error bound and extensions. SIAM J. Sci. Comput. 38(2), A631–A648 (2016)
Eldén, L.: Numerical linear algebra in data mining. Acta Numer. 15, 327–384 (2006)
Faddeev, D.K., Kublanovskaya, V.N., Faddeeva, V.N.: Solution of linear algebraic systems with rectangular matrices. Trudy Mat. Inst. Steklov. 96, 76–92 (1968)
Golub, G.H.: Numerical methods for solving linear least squares problems. Numer. Math. 7, 206–216 (1965)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)
Hong, Y.P., Pan, C.-T.: Rank-revealing QR factorizations and the singular value decomposition. Math. Comput. 58, 213–232 (1992)
Kahan, W.: Numerical linear algebra. Can. Math. Bull. 9, 757–801 (1966)
Kawamura, H.: Analysis of truncation error of matrix low rank approximation algorithm using QR decomposition with pivot selection. Trans. Jpn. Soc. Ind. Appl. Math. 30(2), 163–176 (2020)
Khoromskij, B.N.: Tensors-structured numerical methods in scientific computing: survey on recent advances. Chemom. Intell. Lab. Syst. 110, 1–19 (2012)
Kumar, N.K., Schneider, J.: Literature survey on low rank approximation of matrices. Linear Multilinear Algebra 65(11), 2212–2244 (2017)
Ye, J.: Generalized low rank approximations of matrices. Mach. Learn. 61, 167–191 (2005)
Kawamura, H., Suda, R. Least upper bound of truncation error of low-rank matrix approximation algorithm using QR decomposition with pivoting. Japan J. Indust. Appl. Math. 38, 757–779 (2021). https://doi.org/10.1007/s13160-021-00459-x