Abstract
Solving linear systems of equations is a fundamental problem in mathematics. When the linear system is so large that it cannot be loaded into memory at once, iterative methods such as the randomized Kaczmarz method excel. Here, we extend the randomized Kaczmarz method to solve multi-linear (tensor) systems under the tensor–tensor t-product. We present convergence guarantees for tensor randomized Kaczmarz in two ways: using the classical matrix randomized Kaczmarz analysis and taking advantage of the tensor–tensor t-product structure. We demonstrate experimentally that the tensor randomized Kaczmarz method converges faster than traditional randomized Kaczmarz applied to a naively matricized version of the linear system. In addition, we draw connections between the proposed algorithm and a previously known extension of the randomized Kaczmarz algorithm for matrix linear systems.
Notes
While the randomized Kaczmarz literature typically abbreviates the method as RK, throughout this work MRK is used to distinguish the matrix version of randomized Kaczmarz from the tensor version.
These linear systems are typically written as linear systems with block circulant matrices which are equivalent to the t-product as discussed in Definition 2.2.
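This equivalence is easy to check numerically. Below is a minimal sketch in NumPy, with `bcirc`, `unfold`, and `t_product` implemented from the standard definitions; the helper names are ours, not notation from the paper.

```python
import numpy as np

def t_product(A, B):
    # t-product A * B: FFT along tubes, multiply frontal slices, invert the FFT
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.fft.ifft(np.einsum('ilk,lpk->ipk', Ah, Bh), axis=2).real

def bcirc(A):
    # block circulant matrix: block (i, j) is frontal slice (i - j) mod n
    n = A.shape[2]
    return np.block([[A[:, :, (i - j) % n] for j in range(n)] for i in range(n)])

def unfold(B):
    # stack the frontal slices of an l x p x n tensor into an (l n) x p matrix
    return np.vstack([B[:, :, k] for k in range(B.shape[2])])

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
# multiplying by bcirc(A) is the matrix form of the t-product A * B
assert np.allclose(bcirc(A) @ unfold(B), unfold(t_product(A, B)))
```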
References
Agmon, S.: The relaxation method for linear inequalities. Can. J. Math. 6, 382–392 (1954)
Ahn, C.H., Jeong, B.S., Lee, S.Y.: Efficient hybrid finite element-boundary element method for 3-dimensional open-boundary field problems. IEEE Trans. Magn. 27, 4069–4072 (1991)
Bone and joint CT-scan data. https://isbweb.org/data/vsj/
Censor, Y.: Row-action methods for huge and sparse systems and their applications. SIAM Rev. 23(4), 444–466 (1981)
Czuprynski, K.D., Fahnline, J.B., Shontz, S.M.: Parallel boundary element solutions of block circulant linear systems for acoustic radiation problems with rotationally symmetric boundary surfaces. In: INTER-NOISE and NOISE-CON Congress and Conference Proceedings, vol. 2012, pp. 2812–2823. Institute of Noise Control Engineering (2012)
De Loera, J.A., Haddock, J., Needell, D.: A sampling Kaczmarz–Motzkin algorithm for linear feasibility. SIAM J. Sci. Comput. 39(5), S66–S87 (2017)
Drineas, P., Mahoney, M.W.: RandNLA: randomized numerical linear algebra. Commun. ACM 59(6), 80–90 (2016)
Finding His Voice. Western Electric Company (1929). https://archive.org/details/FindingH1929
Elfving, T.: Block-iterative methods for consistent and inconsistent linear equations. Numer. Math. 35(1), 1–12 (1980)
Gower, R.M., Richtárik, P.: Randomized iterative methods for linear systems. SIAM J. Matrix Anal. Appl. 36(4), 1660–1690 (2015)
Haddock, J., Needell, D.: On Motzkin’s method for inconsistent linear systems. BIT 59(2), 387–401 (2019)
Hao, N., Kilmer, M.E., Braman, K., Hoover, R.C.: Facial recognition using tensor–tensor decompositions. SIAM J. Imaging Sci. 6(1), 437–463 (2013)
Kaczmarz, M.S.: Angenäherte auflösung von systemen linearer gleichungen. Bull. Acad. Polonaise Sci. Lett. 35, 355–357 (1937)
Kernfeld, E., Kilmer, M., Aeron, S.: Tensor–tensor products with invertible linear transforms. Linear Algebra Appl. 485, 545–570 (2015)
Kilmer, M.E., Braman, K., Hao, N., Hoover, R.C.: Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging. SIAM J. Matrix Anal. Appl. 34(1), 148–172 (2013)
Kilmer, M.E., Martin, C.D.: Factorization strategies for third-order tensors. Linear Algebra Appl. 435(3), 641–658 (2011)
Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
Liu, Z., Zhao, H.V., Elezzabi, A.Y.: Block-based adaptive compressed sensing for video. In: Proc. IEEE Int. Conf. Image Process., pp. 1649–1652. IEEE (2010)
Lund, K.: The tensor t-function: a definition for functions of third-order tensors. Numer. Linear Algebra Appl. 27(3), e2288 (2020)
Ma, A., Needell, D., Ramdas, A.: Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods. SIAM J. Matrix Anal. Appl. 36(4), 1590–1604 (2015)
Majumdar, A., Ward, R.K.: Face recognition from video: an MMV recovery approach. In: Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2221–2224. IEEE (2012)
Miao, Y., Qi, L., Wei, Y.: Generalized tensor function via the tensor singular value decomposition based on the t-product. Linear Algebra Appl. 590, 258–303 (2020)
Motzkin, T.S., Schoenberg, I.J.: The relaxation method for linear inequalities. Can. J. Math. 6, 393–404 (1954)
Needell, D.: Randomized Kaczmarz solver for noisy linear systems. BIT 50(2), 395–403 (2010)
Needell, D., Tropp, J.A.: Paved with good intentions: analysis of a randomized block Kaczmarz method. Linear Algebra Appl. 441, 199–221 (2014)
Needell, D., Ward, R., Srebro, N.: Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Adv. Neural Inf. Process. Syst. 27, 1017–1025 (2014)
Newman, E., Horesh, L., Avron, H., Kilmer, M.: Stable tensor neural networks for rapid deep learning. arXiv preprint arXiv:1811.06569 (2018)
Nutini, J., Sepehry, B., Virani, A., Laradji, I., Schmidt, M., Koepke, H.: Convergence rates for greedy Kaczmarz algorithms. In: UAI (2016)
Petra, S., Popa, C.: Single projection Kaczmarz extended algorithms. Numer. Algorithms 73(3), 791–806 (2016)
Richtárik, P., Takáč, M.: Stochastic reformulations of linear systems: algorithms and convergence theory. SIAM J. Matrix Anal. Appl. 41(2), 487–524 (2020)
Semerci, O., Hao, N., Kilmer, M.E., Miller, E.L.: Tensor-based formulation and nuclear norm regularization for multienergy computed tomography. IEEE Trans. Image Process. 23(4), 1678–1693 (2014)
Soltani, S., Kilmer, M.E., Hansen, P.C.: A tensor-based dictionary learning approach to tomographic image reconstruction. BIT 56(4), 1425–1454 (2016)
Song, G., Ng, M.K., Zhang, X.: Robust tensor completion using transformed tensor singular value decomposition. Numer. Linear Algebra Appl. 27(3), e2299 (2020). https://doi.org/10.1002/nla.2299
Strohmer, T., Vershynin, R.: A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl. 15(2), 262–278 (2009)
Vescovo, R.: Electromagnetic scattering from cylindrical arrays of infinitely long thin wires. Electron. Lett. 31(19), 1646–1647 (1995)
Wang, X., Che, M., Wei, Y.: Tensor neural network models for tensor singular value decompositions. Comput. Optim. Appl. 75(3), 753–777 (2020)
Zhang, Z., Aeron, S.: Denoising and completion of 3D data via multidimensional dictionary learning. In: Proc. Int. Joint Conf. Artificial Intelligence (IJCAI), pp. 2371–2377 (2016)
Zhang, Z., Aeron, S.: Exact tensor completion using t-SVD. IEEE Trans. Signal Process. 65(6), 1511–1526 (2017)
Zhang, Z., Ely, G., Aeron, S., Hao, N., Kilmer, M.: Novel methods for multilinear data completion and de-noising based on tensor-SVD. In: CVPR, pp. 3842–3849. IEEE (2014)
Zhou, P., Lu, C., Lin, Z., Zhang, C.: Tensor factorization for low-rank tensor completion. IEEE Trans. Image Process. 27(3), 1152–1163 (2018)
Zouzias, A., Freris, N.M.: Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix Anal. Appl. 34(2), 773–793 (2013)
Communicated by Daniel Kressner.
This work began at the 2019 workshop for Women in Science of Data and Math (WISDM) held at the Institute for Computational and Experimental Research in Mathematics (ICERM). The workshop was partially supported by an NSF ADVANCE grant (award #1500481) to the Association for Women in Mathematics (AWM). Ma was partially supported by U.S. Air Force Award FA9550-18-1-0031 led by Roman Vershynin. Molitor was partially supported by NSF CAREER DMS #1348721 and NSF BIGDATA DMS #1740325 led by Deanna Needell. The authors would also like to thank Misha Kilmer for her advising during the WISDM workshop and valuable feedback that improved earlier versions of this manuscript.
Appendices
Proof of Fact 1
The following properties of block circulant matrices will be useful in proving Fact 1.
Fact 2
(Lemma 1.iii [19]) For tensors \({{\mathscr {A}}}\) and \({{\mathscr {B}}}\), \(\text {bcirc}\left( {{\mathscr {A}}}{{\mathscr {B}}}\right) = \text {bcirc}\left( {{\mathscr {A}}}\right) \text {bcirc}\left( {{\mathscr {B}}}\right) \).
Fact 3
(Theorem 6.ii [19]) The block circulant operator \(\text {bcirc}\left( \cdot \right) \) commutes with the conjugate transpose: \(\text {bcirc}\left( {{\mathscr {A}}}^*\right) = \text {bcirc}\left( {{\mathscr {A}}}\right) ^*\).
1. Part 1 of Fact 1 states
Proof
Let \({{\mathscr {A}}}\in {\mathbb {C}}^{m\times \ell \times n}\) and \({{\mathscr {B}}}\in {\mathbb {C}}^{\ell \times p\times n}\), then
\(\square \)
2. Part 2 of Fact 1 states
Proof
Let \({{\mathscr {A}}}\in {\mathbb {C}}^{m\times \ell \times n}\) and \({{\mathscr {B}}}\in {\mathbb {C}}^{m \times \ell \times n}\), then
\(\square \)
3. Part 3 of Fact 1 states that
Additionally, it states that if \(\text {bcirc}\left( {{\mathscr {M}}}\right) \) is symmetric, \(\text {bdiag}\left( {\widehat{{{\mathscr {M}}}}}\right) \) is also symmetric.
Proof
Let \({{\mathscr {M}}}\in {\mathbb {C}}^{m \times \ell \times n}\). Then
To see that \(\text {bdiag}\left( {\widehat{{{\mathscr {M}}}}}\right) \) is also symmetric when \(\text {bcirc}\left( {{\mathscr {M}}}\right) \) is symmetric, note that
\(\square \)
4. Finally, part 4 of Fact 1 states
Proof
Let \({{\mathscr {M}}}\in {\mathbb {C}}^{m\times m \times n}\). Note that \(\text {bcirc}\left( {{\mathscr {I}}}_m\right) = \mathbf{I}_{mn}\). Using Fact 2 and Eq. (4.1),
Analogously, one can show \(\text {bdiag}\left( {\widehat{{{\mathscr {M}}}}}\right) \text {bdiag}\left( \widehat{{{\mathscr {M}}}^{-1}}\right) = \mathbf{I}_{mn}\). \(\square \)
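The block circulant identities used throughout this appendix (Facts 2 and 3) are straightforward to verify numerically. The sketch below assumes NumPy; `bcirc`, `t_product`, and `t_transpose` are our illustrative implementations of the standard definitions.

```python
import numpy as np

def t_product(A, B):
    # t-product: frontal-slice products in the Fourier domain
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.fft.ifft(np.einsum('ilk,lpk->ipk', Ah, Bh), axis=2).real

def t_transpose(A):
    # transpose each frontal slice, then reverse the order of slices 2..n
    At = np.transpose(A, (1, 0, 2)).conj()
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def bcirc(A):
    # block circulant matrix: block (i, j) is frontal slice (i - j) mod n
    n = A.shape[2]
    return np.block([[A[:, :, (i - j) % n] for j in range(n)] for i in range(n)])

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
assert np.allclose(bcirc(t_product(A, B)), bcirc(A) @ bcirc(B))  # Fact 2
assert np.allclose(bcirc(t_transpose(A)), bcirc(A).T)            # Fact 3
```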
Proof of Theorem 4.1
We now prove Theorem 4.1.
Proof
Let \(\widehat{{{\mathscr {P}}}_i}\) be the tensor formed by applying FFTs to each tube fiber of \({{\mathscr {P}}}_i = {{\mathscr {A}}}_{i::}^* \left( {{\mathscr {A}}}_{i::}{{\mathscr {A}}}_{i::}^*\right) ^{-1}{{\mathscr {A}}}_{i::}\), and let \({{\mathscr {E}}}^t = {{\mathscr {X}}}^t - {{\mathscr {X}}}^*\). By Eq. (4.1), \(\text {bdiag}\left( \widehat{{{\mathscr {P}}}_i}\right) \) is a block diagonal matrix with blocks \(\left( \widehat{\mathbf{P}_i}\right) _k\), where \(\left( \widehat{\mathbf{P}_i}\right) _k\) is the kth frontal slice of the tensor \(\widehat{{{\mathscr {P}}}_i}\). We note that the projected error can be rewritten as
Note that the rows of \(\text {bcirc}\left( {{\mathscr {A}}}_{i::}\right) \) are also rows of \(\text {bcirc}\left( {{\mathscr {A}}}\right) \). Thus, \({{\mathscr {X}}}^{t+1} \in \text {rowsp}\left( {{\mathscr {A}}}\right) \). Since \({{\mathscr {X}}}^*\) is the tensor of least Frobenius norm, \({{\mathscr {X}}}^*\in \text {rowsp}\left( {{\mathscr {A}}}\right) \). Therefore \({{\mathscr {E}}}^t = {{\mathscr {X}}}^t - {{\mathscr {X}}}^* \in \text {rowsp}\left( {{\mathscr {A}}}\right) \) as long as \({{\mathscr {X}}}^0\in \text {rowsp}\left( {{\mathscr {A}}}\right) \).
Now, since \({\mathbb {E}}\left[ \text {bdiag}\left( \widehat{{{\mathscr {P}}}_i}\right) \right] \) is symmetric and \({{\mathscr {E}}}^t\in \text {rowsp}\left( {{\mathscr {A}}}\right) \), by Fact 1,
Note that,
Since \(\text {bdiag}\left( \widehat{{{\mathscr {P}}}_i}\right) \) is block diagonal,
Factoring \(\text {bdiag}\left( \widehat{{{\mathscr {P}}}_i}\right) \) and using Fact 1,
Noting that \(\text {bdiag}\left( \widehat{{{{\mathscr {A}}}}_{i::}}\right) \text {bdiag}\left( \widehat{{{{\mathscr {A}}}}_{i::}^*}\right) \) is a diagonal matrix, one can see that \(\left( \widehat{\mathbf{P}_i}\right) _k\) is the projection onto \(\left( \widehat{{{{\mathscr {A}}}}_{i::}}\right) _k\) by rewriting the kth frontal face of \(\widehat{{{\mathscr {P}}}_i}\) as
We can thus rewrite Eq. (B.1) as
The expectation of Eq. (B.1) can now be calculated explicitly. For simplicity, we assume that the row indices i are sampled uniformly; as in the MRK literature, many other sampling distributions could be used.
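The key structural point of the proof, that each frontal slice \(\left( \widehat{\mathbf{P}_i}\right) _k\) is the orthogonal projection onto the kth Fourier-domain row \(\left( \widehat{{{{\mathscr {A}}}}_{i::}}\right) _k\), can be checked directly; below is a minimal NumPy sketch with illustrative variable names.

```python
import numpy as np

rng = np.random.default_rng(2)
l, n = 5, 4
Ai = rng.standard_normal((1, l, n))   # a row slice A_{i::}
Ah = np.fft.fft(Ai, axis=2)           # its frontal slices in the Fourier domain
for k in range(n):
    a = Ah[:, :, k]                            # 1 x l complex row (A-hat_{i::})_k
    P = a.conj().T @ a / (a @ a.conj().T)      # a* (a a*)^{-1} a, a 1x1 inverse
    assert np.allclose(P @ P, P)               # idempotent
    assert np.allclose(P.conj().T, P)          # Hermitian
    assert np.allclose(P @ a.conj().T, a.conj().T)  # fixes the row itself
```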
To derive a lower bound for the smallest singular value in Eq. (B.2), define
The values \(\left( \widehat{{{{\mathscr {A}}}}_{i::}}\widehat{{{{\mathscr {A}}}}_{i::}^*}\right) _k\) are necessarily positive for all \(k \in [n-1]\) when \({{\mathscr {A}}}_{i::}{{\mathscr {A}}}_{i::}^*\) is invertible for all \(i \in [m-1]\).
Now, it can be easily verified that
The projected error of Eq. (B.2) then becomes
We can thus rewrite the guarantee in Theorem 4.1 for uniform random sampling of the row indices i as
\(\square \)
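For concreteness, the TRK iteration analyzed above can be sketched as follows, assuming NumPy and uniform row-slice sampling. The helpers `t_product`, `t_transpose`, and `t_inverse_1x1` are our illustrative implementations of the t-product algebra, and the problem sizes are arbitrary.

```python
import numpy as np

def t_product(A, B):
    # t-product via frontal-slice products in the Fourier domain
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.fft.ifft(np.einsum('ilk,lpk->ipk', Ah, Bh), axis=2).real

def t_transpose(A):
    # transpose each frontal slice, then reverse the order of slices 2..n
    At = np.transpose(A, (1, 0, 2)).conj()
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_inverse_1x1(T):
    # inverse of a 1 x 1 x n tensor: invert each Fourier coefficient
    return np.fft.ifft(1.0 / np.fft.fft(T, axis=2), axis=2).real

rng = np.random.default_rng(1)
m, l, p, n = 20, 5, 3, 4
A = rng.standard_normal((m, l, n))
X_star = rng.standard_normal((l, p, n))
B = t_product(A, X_star)              # consistent system A * X = B

X = np.zeros((l, p, n))               # X^0 = 0 lies in rowsp(A)
for t in range(3000):
    i = rng.integers(m)               # uniform row-slice sampling
    Ai = A[i:i+1, :, :]               # 1 x l x n row slice A_{i::}
    R = B[i:i+1, :, :] - t_product(Ai, X)
    Minv = t_inverse_1x1(t_product(Ai, t_transpose(Ai)))
    X = X + t_product(t_transpose(Ai), t_product(Minv, R))

assert np.linalg.norm(X - X_star) < 1e-8   # linear convergence to X^*
```

Each update projects the iterate onto the solution set of one row slice; in the Fourier domain this decouples into n independent scalar-row Kaczmarz projections, which is exactly the structure exploited in the proof above.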
Ma, A., Molitor, D.: Randomized Kaczmarz for tensor linear systems. BIT Numer. Math. 62, 171–194 (2022). https://doi.org/10.1007/s10543-021-00877-w