Randomized numerical linear algebra: Foundations and algorithms

Per-Gunnar Martinsson; Joel A. Tropp

doi:10.1017/S0962492920000021

Randomized numerical linear algebra: Foundations and algorithms

Published online by Cambridge University Press: 30 November 2020

Per-Gunnar Martinsson and

Joel A. Tropp

Show author details

Per-Gunnar Martinsson: Affiliation:
Department of Mathematics, University of Texas at AustinAustin, TX78712, USA E-mail: pgm@oden.utexas.edu
Joel A. Tropp: Affiliation:
Computing & Mathematical Sciences, California Institute of TechnologyPasadena, CA91125, USA E-mail: jtropp@cms.caltech.edu

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

This survey describes probabilistic algorithms for linear algebraic computations, such as factorizing matrices and solving linear systems. It focuses on techniques that have a proven track record for real-world problems. The paper treats both the theoretical foundations of the subject and practical computational issues.

Topics include norm estimation, matrix approximation by sampling, structured and unstructured random embeddings, linear regression problems, low-rank approximation, subspace iteration and Krylov methods, error estimation and adaptivity, interpolatory and CUR factorizations, Nyström approximation of positive semidefinite matrices, single-view (‘streaming’) algorithms, full rank-revealing factorizations, solvers for linear systems, and approximation of kernel matrices that arise in machine learning and in scientific computing.

Type: Research Article
Information: Acta Numerica , Volume 29 , May 2020 , pp. 403 - 572

DOI: https://doi.org/10.1017/S0962492920000021 [Opens in a new window]
Copyright: © The Author(s), 2020. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

References 2

Achlioptas, D. (2003), ‘Database-friendly random projections: Johnson–Lindenstrauss with binary coins’, J. Comput. System Sci. 66, 671–687.Google Scholar

Achlioptas, D. and McSherry, F. (2001), Fast computation of low rank matrix approximations. In 33rd Annual ACM Symposium on Theory of Computing, ACM, pp. 611–618.Google Scholar

Achlioptas, D. and McSherry, F. (2007), ‘Fast computation of low-rank matrix approximations’, J. Assoc. Comput. Mach. 54, 9.CrossRef Google Scholar

Ahlswede, R. and Winter, A. (2002), ‘Strong converse for identification via quantum channels’, IEEE Trans. Inform. Theory 48, 569–579.CrossRef Google Scholar

Ailon, N. and Chazelle, B. (2006), Approximate nearest neighbors and the fast Johnson–Lindenstrauss transform. In 38th Annual ACM Symposium on Theory of Computing, ACM, pp. 557–563.Google Scholar

Ailon, N. and Chazelle, B. (2009), ‘The fast Johnson–Lindenstrauss transform and approximate nearest neighbors’, SIAM J. Comput. 39, 302–322.Google Scholar

Alaoui, A. and Mahoney, M. W. (2015), Fast randomized kernel ridge regression with statistical guarantees. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 775–783.Google Scholar

Alon, N., Gibbons, P. B., Matias, Y. and Szegedy, M. (2002), ‘Tracking join and self-join sizes in limited storage’, J. Comput. System Sci. 64, 719–747.CrossRef Google Scholar

Alon, N., Matias, Y. and Szegedy, M. (1999), ‘The space complexity of approximating the frequency moments’, J. Comput. System Sci. 58, 137–147.Google Scholar

Ambikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D. W. and O’Neil, M. (2015), ‘Fast direct methods for Gaussian processes’, IEEE Trans. Pattern Anal. Machine Intel. 38, 252–265.CrossRef Google Scholar

Amelunxen, D., Lotz, M., McCoy, M. B. and Tropp, J. A. (2014), ‘Living on the edge: Phase transitions in convex programs with random data’, Inf. Inference 3, 224–294.CrossRef Google Scholar

Ando, T. (2005), Schur complements and matrix inequalities: Operator-theoretic approach. In The Schur Complement and its Applications (Zhang, F., ed.), Vol. 4 of Numerical Methods and Algorithms, Springer, pp. 137–162.CrossRef Google Scholar

Arcones, M. A. and Gine, E. (1992), ‘On the bootstrap of

$u$ and

$v$ statistics’, Ann. Statist. 20, 655–674.Google Scholar

Arras, B., Bachmayr, M. and Cohen, A. (2019), ‘Sequential sampling for optimal weighted least squares approximations in hierarchical spaces’, SIAM J. Math. Data Sci. 1, 189–207.CrossRef Google Scholar

Avron, H. (2018), Randomized Riemannian preconditioning for quadratically constrained problems. Slides, Workshop on Randomized Numerical Linear Algebra, Simons Institute, UC Berkeley.Google Scholar

Avron, H., Clarkson, K. and Woodruff, D. (2017), ‘Faster kernel ridge regression using sketching and preconditioning’, SIAM J. Matrix Anal. Appl. 38, 1116–1138.CrossRef Google Scholar

Avron, H., Druinsky, A. and Gupta, A. (2015), ‘Revisiting asynchronous linear solvers: Provable convergence rate through randomization’, J. Assoc. Comput. Mach. 62, 51.CrossRef Google Scholar

Avron, H., Kapralov, M., Musco, C., Musco, C., Velingker, A. and Zandieh, A. (2019), A universal sampling method for reconstructing signals with simple Fourier transforms. In 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC 2019), ACM, pp. 1051–1063.CrossRef Google Scholar

Avron, H., Maymounkov, P. and Toledo, S. (2010), ‘Blendenpik: Supercharging LAPACK’s least-squares solver’, SIAM J. Sci. Comput. 32, 1217–1236.CrossRef Google Scholar

Baboulin, M., Dongarra, J. J., Rémy, A., Tomov, S. and Yamazaki, I. (2017), ‘Solving dense symmetric indefinite systems using GPUs’, Concurr. Comput. Pract. Exp. 29, e4055.CrossRef Google Scholar

Baboulin, M., Li, X. S. and Rouet, F.-H. (2014), Using random butterfly transformations to avoid pivoting in sparse direct methods. In International Conference on High Performance Computing for Computational Science (VECPAR 2014), Vol. 8969 of Lecture Notes in Computer Science, Springer, pp. 135–144.Google Scholar

Bach, F. (2013), Sharp analysis of low-rank kernel matrix approximations. In 26th Annual Conference on Learning Theory, Vol. 30 of JMLR Workshop and Conference Proceedings, PMLR, pp. 185–209.Google Scholar

Bach, F. (2017), ‘On the equivalence between kernel quadrature rules and random feature expansions’, J. Mach. Learn. Res. 18, 1–38.Google Scholar

Bach, F. R. and Jordan, M. (2005), Predictive low-rank decomposition for kernel methods. In 22nd International Conference on Machine Learning (ICML ’05), ACM, pp. 33–40.Google Scholar

Bai, Z. and Silverstein, J. W. (2010), Spectral Analysis of Large Dimensional Random Matrices, second edition, Springer Series in Statistics, Springer.CrossRef Google Scholar

Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. and van der Vorst, H. (1987), Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide (Software, Environments and Tools), first edition, SIAM.Google Scholar

Baldi, P. and Vershynin, R. (2019), ‘Polynomial threshold functions, hyperplane arrangements, and random tensors’, SIAM J. Math. Data Sci. 1, 699–729.CrossRef Google Scholar

Ballard, G., Demmel, J., Dumitriu, I. and Rusciano, A. (2019), A generalized randomized rank-revealing factorization. arXiv:1909.06524 Google Scholar

Banks, J., Vargas, J. G., Kulkarni, A. and Srivastava, N. (2019), Pseudospectral shattering, the sign function, and diagonalization in nearly matrix multiplication time. arXiv:1912.08805 Google Scholar

Barnes, J. and Hut, P. (1986), ‘A hierarchical

$O(NlogN)$ force-calculation algorithm’, Nature 324, 446–449.CrossRef Google Scholar

Bartlett, P. (2013), U-statistics. Berkeley Statistics 210B Lecture Notes.Google Scholar

Batson, J., Spielman, D. A. and Srivastava, N. (2014), ‘Twice-Ramanujan sparsifiers’, SIAM Rev. 56, 315–334.CrossRef Google Scholar

Bebendorf, M. and Grzhibovskis, R. (2006), ‘Accelerating Galerkin BEM for linear elasticity using adaptive cross approximation’, Math. Methods Appl. Sci. 29, 1721–1747.CrossRef Google Scholar

Bhatia, R. (1997), Matrix Analysis, Vol. 169 of Graduate Texts in Mathematics, Springer.CrossRef Google Scholar

Bischof, C. H. and Quintana-Ortí, G. (1998), ‘Computing rank-revealing QR factorizations of dense matrices’, ACM Trans. Math. Software 24, 226–253.CrossRef Google Scholar

Bochner, S. (1933), ‘Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse’, Math. Ann. 108, 378–410.CrossRef Google Scholar

Bottou, L. (2010), Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT ’2010 (Lechevallier, Y. and Saporta, G., eds), Physica-Verlag HD, pp. 177–186.CrossRef Google Scholar

Boucheron, S., Lugosi, G. and Massart, P. (2013), Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press.Google Scholar

Bourgain, J., Dirksen, S. and Nelson, J. (2015), ‘Toward a unified theory of sparse dimensionality reduction in Euclidean space’, Geom. Funct. Anal. 25, 1009–1088.CrossRef Google Scholar

Boutsidis, C., Mahoney, M. W. and Drineas, P. (2009), An improved approximation algorithm for the column subset selection problem. In 20th Annual ACM–SIAM Symposium on Discrete Algorithms, SIAM, pp. 968–977.Google Scholar

Boutsidis, C., Woodruff, D. P. and Zhong, P. (2016), Optimal principal component analysis in distributed and streaming models. In 48th Annual ACM Symposium on Theory of Computing, ACM, pp. 236–249.Google Scholar

Briggs, W. L. and Henson, V. E. (1995), The DFT: An Owner’s Manual for the Discrete Fourier Transform, SIAM.Google Scholar

Bringmann, K. and Panagiotou, K. (2017), ‘Efficient sampling methods for discrete distributions’, Algorithmica 79, 484–508.CrossRef Google Scholar

Candès, E., Demanet, L. and Ying, L. (2009), ‘A fast butterfly algorithm for the computation of Fourier integral operators’, Multiscale Model. Simul. 7, 1727–1750.CrossRef Google Scholar

Carl, B. (1985), ‘Inequalities of Bernstein–Jackson-type and the degree of compactness of operators in Banach spaces’, Ann. Inst. Fourier (Grenoble) 35, 79–118.Google Scholar

Carmon, Y., Duchi, J. C., Sidford, A. and Tian, K. (2019), A rank-1 sketch for matrix multiplicative weights. In 32nd Conference on Learning Theory (Beygelzimer, A. and Hsu, D., eds), Vol. 99 of Proceedings of Machine Learning Research, PMLR, pp. 589–623.Google Scholar

Chandrasekaran, V., Recht, B., Parrilo, P. A. and Willsky, A. S. (2012), ‘The convex geometry of linear inverse problems’, Found. Comput. Math. 12, 805–849.Google Scholar

Charikar, M., Chen, K. and Farach-Colton, M. (2004), ‘Finding frequent items in data streams’, Theoret. Comput. Sci. 312, 3–15.Google Scholar

Chen, L. H. Y., Goldstein, L. and Shao, Q.-M. (2011), Normal Approximation by Stein’s Method, Probability and its Applications (New York), Springer.CrossRef Google Scholar

Chen, X. and Price, E. (2019), Active regression via linear-sample sparsification. In 32nd Conference on Learning Theory (Beygelzimer, A. and Hsu, D., eds), Vol. 99 of Proceedings of Machine Learning Research, PMLR, pp. 663–695.Google Scholar

Chen, Z. and Dongarra, J. J. (2005), ‘Condition numbers of Gaussian random matrices’, SIAM J. Matrix Anal. Appl. 27, 603–620.CrossRef Google Scholar

Cheng, H., Gimbutas, Z., Martinsson, P.-G. and Rokhlin, V. (2005), ‘On the compression of low rank matrices’, SIAM J. Sci. Comput. 26, 1389–1404.CrossRef Google Scholar

Clarkson, K. L. and Woodruff, D. P. (2009), Numerical linear algebra in the streaming model. In 41st Annual ACM Symposium on Theory of Computing, ACM, pp. 205–214.CrossRef Google Scholar

Clarkson, K. L. and Woodruff, D. P. (2013), Low rank approximation and regression in input sparsity time. In 2013 ACM Symposium on Theory of Computing (STOC ’13), ACM, pp. 81–90.Google Scholar

Cohen, A. and Migliorati, G. (2017), ‘Optimal weighted least-squares methods’, SIAM J. Comput. Math. 3, 181–203.CrossRef Google Scholar

Cohen, A., Davenport, M. A. and Leviatan, D. (2013), ‘On the stability and accuracy of least squares approximations’, Found. Comput. Math. 13, 819–834.CrossRef Google Scholar

Cohen, E. and Lewis, D. D. (1999), ‘Approximating matrix multiplication for pattern recognition tasks’, J. Algorithms 30, 211–252.Google Scholar

Cohen, M. B. (2016), Nearly tight oblivious subspace embeddings by trace inequalities. In 27th Annual ACM–SIAM Symposium on Discrete Algorithms, ACM, pp. 278–287.Google Scholar

Cohen, M. B., Kyng, R., Miller, G. L., Pachocki, J. W., Peng, R., Rao, A. B. and Xu, S. C. (2014), Solving SDD linear systems in nearly

time. In 46th Annual ACM Symposium on Theory of Computing (STOC ’14), ACM, pp. 343–352.CrossRef Google Scholar

Cullum, J. and Donath, W. E. (1974), A block generalization of the

$s$ -step Lanczos algorithm. Report RC 4845 (21570), IBM Thomas J. Watson Research Center, Yorktown Heights, New York.Google Scholar

Dao, T., Gu, A., Eichhorn, M., Rudra, A. and Ré, C. (2019), Learning fast algorithms for linear transforms using butterfly factorizations. In 36th International Conference on Machine Learning (Chaudhuri, K. and Salakhutdinov, R., eds), Vol. 97 of Proceedings of Machine Learning Research, PMLR, pp. 1517–1527.Google Scholar

Davidson, K. R. and Szarek, S. J. (2001), Local operator theory, random matrices and Banach spaces. In Handbook of the Geometry of Banach Spaces, Vol. I, North-Holland, pp. 317–366.Google Scholar

Davis, T. A., Rajamanickam, S. and Sid-Lakhdar, W. M. (2016), A survey of direct methods for sparse linear systems. In Acta Numerica, Vol. 25, Cambridge University Press, pp. 383–566.Google Scholar

Demmel, J., Dumitriu, I. and Holtz, O. (2007), ‘Fast linear algebra is stable’, Numer. Math. 108, 59–91.Google Scholar

Demmel, J., Grigori, L., Hoemmen, M. and Langou, J. (2012), ‘Communication-optimal parallel and sequential QR and LU factorizations’, SIAM J. Sci. Comput. 34, A206–A239.CrossRef Google Scholar

Demmel, J. W., Grigori, L., Gu, M. and Xiang, H. (2015), ‘Communication avoiding rank revealing QR factorization with column pivoting’, SIAM J. Matrix Anal. Appl. 36, 55–89.Google Scholar

Dixon, J. D. (1983), ‘Estimating extremal eigenvalues and condition numbers of matrices’, SIAM J. Numer. Anal. 20, 812–814.CrossRef Google Scholar

Dobriban, E. and Liu, S. (2019), Asymptotics for sketching in least squares regression. In Advances in Neural Information Processing Systems 32 (Wallach, H. et al., eds), Curran Associates, pp. 3675–3685.Google Scholar

Drineas, P. and Mahoney, M. W. (2018), Lectures on randomized linear algebra. In The Mathematics of Data (Mahoney, M. W., Duchi, J. and Gilbert, A., eds), Vol. 25 of IAS/Park City Mathematics Series, AMS, pp. 1–48.CrossRef Google Scholar

Drineas, P. and Mahoney, M. W. (2005), ‘On the Nyström method for approximating a Gram matrix for improved kernel-based learning’, J. Mach. Learn. Res. 6, 2153–2175.Google Scholar

Drineas, P., Kannan, R. and Mahoney, M. W. (2006 a), ‘Fast Monte Carlo algorithms for matrices, I: Approximating matrix multiplication’, SIAM J. Comput. 36, 132–157.CrossRef Google Scholar

Drineas, P., Kannan, R. and Mahoney, M. W. (2006 b), ‘Fast Monte Carlo algorithms for matrices, II: Computing a low-rank approximation to a matrix’, SIAM J. Comput. 36, 158–183.CrossRef Google Scholar

Drineas, P., Kannan, R. and Mahoney, M. W. (2006 c), ‘Fast Monte Carlo algorithms for matrices, III: Computing a compressed approximate matrix decomposition’, SIAM J. Comput. 36, 184–206.Google Scholar

Drineas, P., Mahoney, M. W. and Muthukrishnan, S. (2006 d), Subspace sampling and relative-error matrix approximation: Column-based methods. In Approximation, Randomization and Combinatorial Optimization, Vol. 4110 of Lecture Notes in Computer Science, Springer, pp. 316–326.CrossRef Google Scholar

Drineas, P., Mahoney, M. W. and Muthukrishnan, S. (2008), ‘Relative-error CUR matrix decompositions’, SIAM J. Matrix Anal. Appl. 30, 844–881.CrossRef Google Scholar

Duersch, J. A. and Gu, M. (2015), True BLAS-3 performance QRCP using random sampling. arXiv:1509.06820v1 Google Scholar

Duersch, J. A. and Gu, M. (2017), ‘Randomized QR with column pivoting’, SIAM J. Sci. Comput. 39, C263–C291.Google Scholar

Edelman, A. S. (1989), Eigenvalues and condition numbers of random matrices. ProQuest LLC, Ann Arbor, MI. PhD thesis, Massachusetts Institute of Technology.Google Scholar

Efron, B. (1982), The Jackknife, the Bootstrap and Other Resampling Plans, Vol. 38 of CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM.CrossRef Google Scholar

Feldman, D., Volkov, M. and Rus, D. (2016), Dimensionality reduction of massive sparse datasets using coresets. In Advances in Neural Information Processing Systems 29 (Lee, D. D. et al., eds), Curran Associates, pp. 2766–2774.Google Scholar

Feng, Y., Xiao, J. and Gu, M. (2019), ‘Flip-flop spectrum-revealing QR factorization and its applications to singular value decomposition’, Electron. Trans. Numer. Anal. 51, 469–494.CrossRef Google Scholar

Fierro, R. D., Hansen, P. C. and Hansen, P. S. K. (1999), ‘UTV tools: Matlab templates for rank-revealing UTV decompositions’, Numer. Algorithms 20, 165–194.Google Scholar

Fine, S. and Scheinberg, K. (2001), ‘Efficient SVM training using low-rank kernel representation’, J. Mach. Learn. Res. 2, 243–264.Google Scholar

Fitzsimons, J. K., Osborne, M. A., Roberts, S. J. and Fitzsimons, J. F. (2018), Improved stochastic trace estimators using mutually unbiased bases. In Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference (Globerson, A. and Silva, R., eds), AUAI Press.Google Scholar

Foucart, S. and Rauhut, H. (2013), A Mathematical Introduction to Compressive Sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer.CrossRef Google Scholar

Frieze, A., Kannan, R. and Vempala, S. (2004), ‘Fast Monte-Carlo algorithms for finding low-rank approximations’, J. Assoc. Comput. Mach. 51, 1025–1041.CrossRef Google Scholar

Ghashami, M., Liberty, E., Phillips, J. M. and Woodruff, D. P. (2016 a), ‘Frequent directions: Simple and deterministic matrix sketching’, SIAM J. Comput. 45, 1762–1792.Google Scholar

Ghashami, M., Perry, D. J. and Phillips, J. (2016 b), Streaming kernel principal component analysis. In 19th International Conference on Artificial Intelligence and Statistics (Gretton, A. and Robert, C. C., eds), Vol. 51 of Proceedings of Machine Learning Research, PMLR, pp. 1365–1374.Google Scholar

Ghysels, P., Li, X. S., Gorman, C. and Rouet, F.-H. (2017), A robust parallel preconditioner for indefinite systems using hierarchical matrices and randomized sampling. In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, pp. 897–906.CrossRef Google Scholar

Gionis, A., Indyk, P. and Motwani, R. (1999), Similarity search in high dimensions via hashing. In 25th International Conference on Very Large Data Bases (VLDB ’99), Morgan Kaufmann, pp. 518–529.Google Scholar

Girard, D. A. (1989), ‘A fast “Monte Carlo cross-validation” procedure for large least squares problems with noisy data’, Numer. Math. 56, 1–23.Google Scholar

Gittens, A. (2013), Topics in Randomized Numerical Linear Algebra. ProQuest LLC, Ann Arbor, MI. PhD thesis, California Institute of Technology.Google Scholar

Goldstein, L., Nourdin, I. and Peccati, G. (2017), ‘Gaussian phase transitions and conic intrinsic volumes: Steining the Steiner formula’, Ann. Appl. Probab. 27, 1–47.CrossRef Google Scholar

Golub, G. H. and Meurant, G. (1994), Matrices, moments and quadrature. In Numerical Analysis 1993, Vol. 303 of Pitman Research Notes in Mathematics Series, Longman Scientific and Technical, pp. 105–156.Google Scholar

Golub, G. H. and Meurant, G. (2010), Matrices, Moments and Quadrature with Applications, Princeton Series in Applied Mathematics, Princeton University Press.CrossRef Google Scholar

Golub, G. H. and Underwood, R. (1977), The block Lanczos method for computing eigenvalues. In Mathematical Software III (Proceedings of a Symposium Conducted by the Mathematics Research Center, the University of Wisconsin–Madison), Academic Press, pp. 361–377.Google Scholar

Golub, G. H. and Van Loan, C. F. (2013), Matrix Computations, fourth edition, Johns Hopkins Studies in the Mathematical Sciences, The Johns Hopkins University Press.Google Scholar

Golub, G. H., Luk, F. T. and Overton, M. L. (1981), ‘A block Lánczos method for computing the singular values of corresponding singular vectors of a matrix’, ACM Trans. Math. Software 7, 149–169.CrossRef Google Scholar

Good, I. J. (1977), ‘A new formula for

$k$ -statistics’, Ann. Statist. 5, 224–228.CrossRef Google Scholar

Gopal, A. and Martinsson, P.-G. (2018), The PowerURV algorithm for computing rank-revealing full factorizations. arXiv:1812.06007 Google Scholar

Gordon, Y. (1988), On Milman’s inequality and random subspaces which escape through a mesh in

${\mathbb{R}}^n$ . In Geometric Aspects of Functional Analysis (1986/87), Vol. 1317 of Lecture Notes in Mathematics, Springer, pp. 84–106.CrossRef Google Scholar

Goreinov, S. A., Oseledets, I. V., Savostyanov, D. V., Tyrtyshnikov, E. E. and Zamarashkin, N. L. (2010), How to find a good submatrix. In Matrix Methods: Theory, Algorithms And Applications: Dedicated to the Memory of Gene Golub, World Scientific, pp. 247–256.CrossRef Google Scholar

Goreinov, S. A., Zamarashkin, N. L. and Tyrtyshnikov, E. E. (1997), ‘Pseudo-skeleton approximations by matrices of maximal volume’, Math. Notes 62, 515–519.CrossRef Google Scholar

Gower, R. and Richtárik, P. (2015), ‘Randomized iterative methods for linear systems’, SIAM J. Matrix Anal. Appl. 36, 1660–1690.CrossRef Google Scholar

Gower, R., Hanzely, F., Richtárik, P. and Stich, S. U. (2018), Accelerated stochastic matrix inversion: General theory and speeding up BFGS rules for faster second-order optimization. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 1619–1629.Google Scholar

Grasedyck, L. and Hackbusch, W. (2003), ‘Construction and arithmetics of

$\mathbf{\mathcal{H}}$ -matrices’, Computing 70, 295–334.CrossRef Google Scholar

Gratton, S. and Titley-Peloquin, D. (2018), ‘Improved bounds for small-sample estimation’, SIAM J. Matrix Anal. Appl. 39, 922–931.CrossRef Google Scholar

Greengard, L. and Rokhlin, V. (1987), ‘A fast algorithm for particle simulations’, J. Comput. Phys. 73, 325–348.CrossRef Google Scholar

Grimmett, G. R. and Stirzaker, D. R. (2001), Probability and Random Processes, third edition, Oxford University Press.Google Scholar

Gu, M. (2015), ‘Subspace iteration randomization and singular value problems’, SIAM J. Sci. Comput. 37, A1139–A1173.Google Scholar

Gu, M. and Eisenstat, S. C. (1996), ‘Efficient algorithms for computing a strong rank-revealing QR factorization’, SIAM J. Sci. Comput. 17, 848–869.CrossRef Google Scholar

Guo, H., Liu, Y., Hu, J. and Michielssen, E. (2017), ‘A butterfly-based direct integral-equation solver using hierarchical LU factorization for analyzing scattering from electrically large conducting objects’, IEEE Trans. Antennas Propagation 65, 4742–4750.CrossRef Google Scholar

Hackbusch, W. (1999), ‘A sparse matrix arithmetic based on H-matrices, I: Introduction to H-matrices’, Computing 62, 89–108.CrossRef Google Scholar

Hackbusch, W., Khoromskij, B. and Sauter, S. (2002), On

${\mathbf{\mathcal{H}}}^2$ -matrices. In Lectures on Applied Mathematics (Bungartz, H.-J., Hoppe, R. H. W. and Zenger, C., eds), Springer, pp. 9–29.Google Scholar

Halko, N., Martinsson, P. G. and Tropp, J. A. (2011 a), ‘Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions’, SIAM Rev. 53, 217–288.CrossRef Google Scholar

Halko, N., Martinsson, P.-G., Shkolnisky, Y. and Tygert, M. (2011 b), ‘An algorithm for the principal component analysis of large data sets’, SIAM J. Sci. Comput. 33, 2580–2594.CrossRef Google Scholar

Hamid, R., Xiao, Y., Gittens, A. and Decoste, D. (2014), Compact random feature maps. In 31st International Conference on Machine Learning (Xing, E. P. and Jebara, T., eds), Vol. 32 of Proceedings of Machine Learning Research, PMLR, pp. 19–27.Google Scholar

Hampton, J. and Doostan, A. (2015), ‘Coherence motivated sampling and convergence analysis of least squares polynomial chaos regression’, Comput. Methods Appl. Mech. Engrg 290, 73–97.CrossRef Google Scholar

Hennig, P. and Osborne, M. A. (2019), probabilistic-numerics.org Google Scholar

Hestenes, M. R. and Stiefel, E. (1952), ‘Methods of conjugate gradients for solving linear systems’, J. Res. Nat. Bur. Standards 49, 2379.CrossRef Google Scholar

Horn, R. A. and Johnson, C. R. (2013), Matrix Analysis, second edition, Cambridge University Press.Google Scholar

Hutchinson, M. F. (1990), ‘A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines’, Comm. Statist. Simul. Comput. 19, 433–450.CrossRef Google Scholar

Indyk, P. and Motwani, R. (1999), Approximate nearest neighbors: Towards removing the curse of dimensionality. In 30th Annual ACM Symposium on Theory of Computing (STOC ’98), ACM, pp. 604–613.Google Scholar

Jin, R., Kolda, T. G. and Ward, R. (2019), Faster Johnson–Lindenstrauss transforms via Kronecker products. arXiv:1909.04801 Google Scholar

Johnson, W. B. and Lindenstrauss, J. (1984), Extensions of Lipschitz mappings into a Hilbert space. In Conference in Modern Analysis and Probability (New Haven, Conn., 1982), Vol. 26 of Contemporary Mathematics, AMS, pp. 189–206.CrossRef Google Scholar

Johnstone, I. M. (2001), ‘On the distribution of the largest eigenvalue in principal components analysis’, Ann. Statist. 29, 295–327.CrossRef Google Scholar

Joubert, W. D. and Carey, G. F. (1991), Parellelizable restarted iterative methods for nonsymmetric linear systems. Center for Numerical Analysis Report CNA-251, UT-Austin.CrossRef Google Scholar

Kac, M. (1956), Foundations of kinetic theory. In Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, Vol. III, University of California Press, pp. 171–197.Google Scholar

Kahan, W. (1966), ‘Numerical linear algebra’, Canad. Math. Bull. 9, 757–801.Google Scholar

Kannan, R. and Vempala, S. (2017), Randomized algorithms in numerical linear algebra. In Acta Numerica, Cambridge University Press, Vol. 26, pp. 95–135.Google Scholar

Kar, P. and Karnick, H. (2012), Random feature maps for dot product kernels. In 15th International Conference on Artificial Intelligence and Statistics (Lawrence, N. D. and Girolami, M., eds), Vol. 22 of Proceedings of Machine Learning Research, PMLR, pp. 583–591.Google Scholar

Kasiviswanathan, S. P., Rudelson, M., Smith, A. and Ullman, J. (2010), The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In 2010 ACM International Symposium on Theory of Computing (STOC ’10), ACM, pp. 775–784.Google Scholar

Kong, W. and Valiant, G. (2017), ‘Spectrum estimation from samples’, Ann. Statist. 45, 2218–2247.Google Scholar

Koroljuk, V. S. and Borovskich, Y. V. (1994), Theory of

$U$ -Statistics, Vol. 273 of Mathematics and its Applications, Kluwer Academic.CrossRef Google Scholar

Krahmer, F. and Ward, R. (2011), ‘New and improved Johnson–Lindenstrauss embeddings via the restricted isometry property’, SIAM J. Math. Anal. 43, 1269–1281.Google Scholar

Kressner, D., Steinlechner, M. and Vandereycken, B. (2016), ‘Preconditioned low-rank Riemannian optimization for linear systems with tensor product structure’, SIAM J. Sci. Comput. 38, A2018–A2044.CrossRef Google Scholar

Kuczyński, J. and Woźniakowski, H. (1992), ‘Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start’, SIAM J. Matrix Anal. Appl. 13, 1094–1122.CrossRef Google Scholar

Kueng, R. (2019), 2-designs minimize variance of trace estimators. Unpublished manuscript.Google Scholar

Kumar, S., Mohri, M. and Talwalkar, A. (2012), ‘Sampling methods for the Nyström method’, J. Mach. Learn. Res. 13, 981–1006.Google Scholar

Kurz, S., Rain, O. and Rjasanow, S. (2002), ‘The adaptive cross-approximation technique for the 3D boundary-element method’, IEEE Trans. Magnetics 38, 421–424.Google Scholar

Kyng, R. (2017), Approximate Gaussian elimination. ProQuest LLC, Ann Arbor, MI. PhD thesis, Yale University.Google Scholar

Kyng, R. and Sachdeva, S. (2016), Approximate Gaussian elimination for Laplacians: Fast, sparse, and simple. In 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2016), IEEE, pp. 573–582.CrossRef Google Scholar

Kyng, R., Lee, Y. T., Peng, R., Sachdeva, S. and Spielman, D. A. (2016), Sparsified Cholesky and multigrid solvers for connection Laplacians. In 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC ’16), ACM, pp. 842–850.Google Scholar

Le, Q., Sarlós, T. and Smola, A. (2013), Fastfood: Computing Hilbert space expansions in loglinear time. In 30th International Conference on Machine Learning (Dasgupta, S. and McAllester, D., eds), Vol. 28 of Proceedings of Machine Learning Research, PMLR, pp. 244–252.Google Scholar

Ledoux, M. and Talagrand, M. (1991), Probability in Banach Spaces: Isoperimetry and Processes, Vol. 23 of Ergebnisse der Mathematik und ihrer Grenzgebiete [Results in Mathematics and Related Areas] (3), Springer.CrossRef Google Scholar

Leventhal, D. and Lewis, A. S. (2010), ‘Randomized methods for linear constraints: Convergence rates and conditioning’, Math. Oper. Res. 35, 641–654.CrossRef Google Scholar

Leyk, Z. and Woźniakowski, H. (1998), ‘Estimating a largest eigenvector by Lanczos and polynomial algorithms with a random start’, Numer. Linear Algebra Appl. 5, 147–164.3.0.CO;2-2>CrossRef Google Scholar

Li, H., Linderman, G. C., Szlam, A., Stanton, K. P., Kluger, Y. and Tygert, M. (2017), ‘Algorithm 971: An implementation of a randomized algorithm for principal component analysis’, ACM Trans. Math. Software 43, Art. 28, 14.CrossRef Google Scholar

Li, M., Bi, W., Kwok, J. T. and Lu, B. (2015 a), ‘Large-scale Nyström kernel matrix approximation using randomized SVD’, IEEE Trans. Neural Networks Learning Syst. 26, 152–164.Google Scholar PubMed

Li, Y. and Yang, H. (2017), ‘Interpolative butterfly factorization’, SIAM J. Sci. Comput. 39, A503–A531.CrossRef Google Scholar

Li, Y., Nguyen, H. L. and Woodruff, D. P. (2014 a), On sketching matrix norms and the top singular vector. In 25th Annual ACM–SIAM Symposium on Discrete Algorithms, ACM, pp. 1562–1581.Google Scholar

Li, Y., Nguyen, H. L. and Woodruff, D. P. (2014 b), Turnstile streaming algorithms might as well be linear sketches. In 2014 ACM Symposium on Theory of Computing (STOC ’14), ACM, pp. 174–183.Google Scholar

Li, Y., Yang, H. and Ying, L. (2018), ‘Multidimensional butterfly factorization’, Appl. Comput. Harmon. Anal. 44, 737–758.CrossRef Google Scholar

Li, Y., Yang, H., Martin, E. R., Ho, K. L. and Ying, L. (2015 b), ‘Butterfly factorization’, Multiscale Model. Simul. 13, 714–732.Google Scholar

Liberty, E. (2009), Accelerated dense random projections. PhD thesis, Computer Science, Yale University.Google Scholar

Liberty, E., Woolfe, F., Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2007), ‘Randomized algorithms for the low-rank approximation of matrices’, Proc. Nat. Acad. Sci. USA 104, 20167–20172.CrossRef Google Scholar PubMed

Lim, L.-H. and Weare, J. (2017), ‘Fast randomized iteration: Diffusion Monte Carlo through the lens of numerical linear algebra’, SIAM Review 59, 547–587.CrossRef Google Scholar

Lin, L., Lu, J. and Ying, L. (2011), ‘Fast construction of hierarchical matrix representation from matrix–vector multiplication’, J. Comput. Phys. 230, 4071–4087.Google Scholar

Linial, N., London, E. and Rabinovich, Y. (1995), ‘The geometry of graphs and some of its algorithmic applications’, Combinatorica 15, 215–245.CrossRef Google Scholar

Lopes, M. E. (2019), ‘Estimating the algorithmic variance of randomized ensembles via the bootstrap’, Ann. Statist. 47, 1088–1112.Google Scholar

Lopez-Paz, D., Sra, S., Smola, A., Ghahramani, Z. and Schölkopf, B. (2014), Randomized nonlinear component analysis. In 31st International Conference on Machine Learning (Xing, E. P. and Jebara, T., eds), Vol. 32 of Proceedings of Machine Learning Research, PMLR, pp. 1359–1367.Google Scholar

Lust-Piquard, F. (1986), ‘Inégalités de Khintchine dans

${C}_p\left(1\lt p\lt \infty \right)$ ’, C.R. Acad. Sci. Paris Sér. I Math. 303, 289–292.Google Scholar

Mackey, L., Jordan, M. I., Chen, R. Y., Farrell, B. and Tropp, J. A. (2014), ‘Matrix concentration inequalities via the method of exchangeable pairs’, Ann. Probab. 42, 906–945.CrossRef Google Scholar

Mahoney, M. W. (2011), ‘Randomized algorithms for matrices and data’, Found. Trends Mach. Learn. 3, 123–224.Google Scholar

Mahoney, M. W. and Drineas, P. (2009), ‘CUR matrix decompositions for improved data analysis’, Proc. Nat. Acad. Sci. USA 106, 697–702.CrossRef Google Scholar PubMed

Malik, O. A. and Becker, S. (2019), Guarantees for the Kronecker fast Johnson–Lindenstrauss transform using a coherence and sampling argument. arXiv:1911.08424 Google Scholar

March, W. B., Xiao, B. and Biros, G. (2015), ‘ASKIT: Approximate skeletonization kernel-independent treecode in high dimensions’, SIAM J. Sci. Comput. 37, A1089–A1110.CrossRef Google Scholar

Marcus, M. B. and Pisier, G. (1981), Random Fourier Series with Applications to Harmonic Analysis, Vol. 101 of Annals of Mathematics Studies, Princeton University Press and University of Tokyo Press.Google Scholar

Martinsson, P.-G. (2008), Rapid factorization of structured matrices via randomized sampling. arXiv:0806.2339 Google Scholar

Martinsson, P.-G. (2011), ‘A fast randomized algorithm for computing a hierarchically semiseparable representation of a matrix’, SIAM J. Matrix Anal. Appl. 32, 1251–1274.CrossRef Google Scholar

Martinsson, P.-G. (2015), Blocked rank-revealing QR factorizations: How randomized sampling can be used to avoid single-vector pivoting. arXiv:1505.08115 Google Scholar

Martinsson, P.-G. (2016), ‘Compressing rank-structured matrices via randomized sampling’, SIAM J. Sci. Comput. 38, A1959–A1986.CrossRef Google Scholar

Martinsson, P.-G. (2018), Randomized methods for matrix computations. In The Mathematics of Data (Mahoney, M. W., Duchi, J. and Gilbert, A., eds), Vol. 25 of IAS/Park City Mathematics Series, AMS, pp. 187–231.CrossRef Google Scholar

Martinsson, P.-G. (2019), Fast Direct Solvers for Elliptic PDEs, Vol. CB96 of CBMS-NSF Conference Series, SIAM.CrossRef Google Scholar

Martinsson, P.-G. and Rokhlin, V. (2005), ‘A fast direct solver for boundary integral equations in two dimensions’, J. Comput. Phys. 205, 1–23.CrossRef Google Scholar

Martinsson, P.-G. and Tropp, J. (2020), Randomized numerical linear algebra: Foundations and algorithms. arXiv:2002.01387 Google Scholar

Martinsson, P.-G. and Voronin, S. (2016), ‘A randomized blocked algorithm for efficiently computing rank-revealing factorizations of matrices’, SIAM J. Sci. Comput. 38, S485–S507.Google Scholar

Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2006 a), A randomized algorithm for the approximation of matrices. Yale CS research report YALEU/DCS/RR-1361, Computer Science Department, Yale University.Google Scholar

Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2006 b), ‘On interpolation and integration in finite-dimensional spaces of bounded functions’, Comm. Appl. Math. Comput. Sci. 1, pp. 133–142.CrossRef Google Scholar

Martinsson, P.-G., Orti, G. Quintana and Heavner, N. (2019), randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization. ACM Trans. Math. Software 45, 4.CrossRef Google Scholar

Martinsson, P.-G., Quintana-Ortí, G., Heavner, N. and van de Geijn, R. (2015), Householder QR factorization with randomization for column pivoting (HQRRP). arXiv:1512.02671 Google Scholar

Martinsson, P.-G., Quintana-Ortí, G., Heavner, N. and van de Geijn, R. (2017), ‘Householder QR factorization with randomization for column pivoting (HQRRP)’, SIAM J. Sci. Comput. 39, C96–C115.CrossRef Google Scholar

Mathias, R. and Stewart, G. W. (1993), ‘A block {QR} algorithm and the singular value decomposition’, Linear Algebra Appl. 182, 91–100.CrossRef Google Scholar

McCoy, M. B. and Tropp, J. A. (2013), The achievable performance of convex demixing. ACM Technical Report 2017-02, Caltech.Google Scholar

McCoy, M. B. and Tropp, J. A. (2014), ‘From Steiner formulas for cones to concentration of intrinsic volumes’, Discrete Comput. Geom. 51, 926–963.Google Scholar

Melgaard, C. and Gu, M. (2015), Gaussian elimination with randomized complete pivoting. arXiv:1511.08528 Google Scholar

Mendelson, S., Rauhut, H. and Ward, R. (2018), ‘Improved bounds for sparse recovery from subsampled random convolutions’, Ann. Appl. Probab. 28, 3491–3527.CrossRef Google Scholar

Meng, X. and Mahoney, M. W. (2013), Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In 2013 ACM Symposium on Theory of Computing (STOC ’13), ACM, pp. 91–100.Google Scholar

Meng, X., Saunders, M. A. and Mahoney, M. W. (2014), ‘LSRN: A parallel iterative solver for strongly over- or underdetermined systems’, SIAM J. Sci. Comput. 36, C95–C118.CrossRef Google Scholar PubMed

Mezzadri, F. (2007), ‘How to generate random matrices from the classical compact groups’, Notices Amer. Math. Soc. 54, 592–604.Google Scholar

Moulines, E. and Bach, F. R. (2011), Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In Advances in Neural Information Processing Systems 24 (Shawe-Taylor, J. et al., eds), Curran Associates, pp. 451–459.Google Scholar

Muirhead, R. J. (1982), Aspects of Multivariate Statistical Theory, Wiley Series in Probability and Mathematical Statistics, Wiley.Google Scholar

Musco, C. and Musco, C. (2015), Randomized block Krylov methods for stronger and faster approximate singular value decomposition. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 1396–1404.Google Scholar

Musco, C. and Musco, C. (2017), Recursive sampling for the Nyström method. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 3833–3845.Google Scholar

Musco, C., Musco, C. and Sidford, A. (2018), Stability of the Lanczos method for matrix function approximation. In 29th Annual ACM–SIAM Symposium on Discrete Algorithms, SIAM, pp. 1605–1624.Google Scholar

Muthukrishnan, S. (2005), ‘Data streams: Algorithms and applications’, Found. Trends Theor. Comput. Sci. 1, 117–236.Google Scholar

Neal, R. M. (1996), Priors for Infinite Networks, Springer, pp. 29–53.Google Scholar

Needell, D. and Tropp, J. A. (2014), ‘Paved with good intentions: Analysis of a randomized block Kaczmarz method’, Linear Algebra Appl. 441, 199–221.CrossRef Google Scholar

Needell, D., Ward, R. and Srebro, N. (2014), Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. In Advances in Neural Information Processing Systems 27 (Ghahramani, Z. et al., eds), Curran Associates, pp. 1017–1025.Google Scholar

Nelson, J. and Nguyen, H. L. (2013), OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS 2013), IEEE, pp. 117–126.CrossRef Google Scholar

The Numerical Algorithms Group (NAG) (2019), The NAG Library Mark 27. https://www.nag.com/content/naglibrary-mark27 Google Scholar

Nyström, E. J. (1930), ‘Über Die Praktische Auflösung von Integralgleichungen mit Anwendungen auf Randwertaufgaben’, Acta Math. 54, 185–204.Google Scholar

Oja, E. (1982), ‘A simplified neuron model as a principal component analyzer’, J. Math. Biol. 15, 267–273.CrossRef Google Scholar PubMed

Oliveira, R. I. (2009 a), Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv:0911.0600 Google Scholar

Oliveira, R. I. (2009 b), ‘On the convergence to equilibrium of Kac’s random walk on matrices’, Ann. Appl. Probab. 19, 1200–1231.CrossRef Google Scholar

O’Neil, M. (2007), A new class of analysis-based fast transforms. PhD thesis, Mathematics, Yale University.Google Scholar

Oymak, S. and Tropp, J. A. (2018), ‘Universality laws for randomized dimension reduction, with applications’, Inf. Inference 7, 337–446.Google Scholar

Pagh, R. (2013), ‘Compressed matrix multiplication’, ACM Trans. Comput. Theory 5, 9.CrossRef Google Scholar

Pan, V. Y. and Zhao, L. (2017), ‘Numerically safe Gaussian elimination with no pivoting’, Linear Algebra Appl. 527, 349–383.CrossRef Google Scholar

Papadimitriou, C. H., Raghavan, P., Tamaki, H. and Vempala, S. (2000), ‘Latent semantic indexing: A probabilistic analysis’, J. Comput. System Sci. 61, 217–235.CrossRef Google Scholar

Park, H. and Eldén, L. (1995), ‘Downdating the rank-revealing URV decomposition’, SIAM J. Matrix Anal. Appl. 16, 138–155.CrossRef Google Scholar

Parker, D. S. (1995), Random butterfly transformations with applications in computational linear algebra. Report CSD-950023, UCLA.Google Scholar

Parlett, B. N. (1998), The Symmetric Eigenvalue Problem, corrected reprint of the 1980 original, Vol. 20 of Classics in Applied Mathematics, SIAM.Google Scholar

Pham, N. and Pagh, R. (2013), Fast and scalable polynomial kernels via explicit feature maps. In 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13), ACM, pp. 239–247.CrossRef Google Scholar

Pilanci, M. and Wainwright, M. J. (2015), ‘Randomized sketches of convex programs with sharp guarantees’, IEEE Trans. Inform. Theory 61, 5096–5115.CrossRef Google Scholar

Pilanci, M. and Wainwright, M. J. (2016), ‘Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares’, J. Mach. Learn. Res. 17, Paper No. 53, 38.Google Scholar

Pillai, N. S. and Smith, A. (2017), ‘Kac’s walk on

$n$ -sphere mixes in

$n\log n$ steps’, Ann. Appl. Probab. 27, 631–650.Google Scholar

Porod, U. (1996), ‘The cut-off phenomenon for random reflections’, Ann. Probab. 24, 74–96.Google Scholar

Pourkamali-Anaraki, F. and Becker, S. (2019), ‘Improved fixed-rank Nyström approximation via QR decomposition: Practical and theoretical aspects’, Neurocomput. 363, 261–272.CrossRef Google Scholar

Rahimi, A. and Recht, B. (2008), Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (Platt, J. C. et al., eds), Curran Associates, pp. 1177–1184.Google Scholar

Rahimi, A. and Recht, B. (2009), Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. In Advances in Neural Information Processing Systems 21 (Koller, D. et al., eds), Curran Associates, pp. 1313–1320.Google Scholar

Rauhut, H. and Ward, R. (2012), ‘Sparse Legendre expansions via

${\ell}_1$ -minimization’, J. Approx. Theory 164, 517–533.CrossRef Google Scholar

Rauhut, H. and Ward, R. (2016), ‘Interpolation via weighted

${\ell}_1$ minimization’, Appl. Comput. Harmon. Anal. 40, 321–351.CrossRef Google Scholar

Rauhut, H., Romberg, J. and Tropp, J. A. (2012), ‘Restricted isometries for partial random circulant matrices’, Appl. Comput. Harmon. Anal. 32, 242–254.CrossRef Google Scholar

Richtárik, P. and Takáč, M. (2020), ‘Stochastic reformulations of linear systems: Algorithms and convergence theory’, SIAM J. Matrix Anal. Appl. o41, 487–524.Google Scholar

Rokhlin, V. and Tygert, M. (2008), ‘A fast randomized algorithm for overdetermined linear least-squares regression’, Proc. Nat. Acad. Sci. USA 105, 13212–13217.CrossRef Google Scholar PubMed

Rokhlin, V., Szlam, A. and Tygert, M. (2009), ‘A randomized algorithm for principal component analysis’, SIAM J. Matrix Anal. Appl. 31, 1100–1124.Google Scholar

Rosenthal, J. S. (1994), ‘Random rotations: Characters and random walks on

$\mathrm{SO}(N)$ ’, Ann. Probab. 22, 398–423.CrossRef Google Scholar

Ross, N. (2011), ‘Fundamentals of Stein’s method’, Probab. Surv. 8, 210–293.Google Scholar

Rudelson, M. (1999), ‘Random vectors in the isotropic position’, J. Funct. Anal. 164, 60–72.CrossRef Google Scholar

Rudelson, M. (2012), ‘Row products of random matrices’, Adv. Math. 231, 3199–3231.CrossRef Google Scholar

Rudelson, M. and Vershynin, R. (2007), ‘Sampling from large matrices: An approach through geometric functional analysis’, J. Assoc. Comput. Mach. 54, 21.CrossRef Google Scholar

Rudelson, M. and Vershynin, R. (2008), ‘On sparse reconstruction from Fourier and Gaussian measurements’, Comm. Pure Appl. Math. 61, 1025–1045.CrossRef Google Scholar

Rudi, A. and Rosasco, L. (2017), Generalization properties of learning with random features. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 3215–3225.Google Scholar

Rudi, A., Calandriello, D., Carratino, L. and Rosasco, L. (2018), On fast leverage score sampling and optimal learning. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 5672–5682.Google Scholar

Rudi, A., Camoriano, R. and Rosasco, L. (2015), Less is more: Nyström computational regularization. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 1657–1665.Google Scholar

Rudi, A., Carratino, L. and Rosasco, L. (2017), FALKON: An optimal large scale kernel method. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 3888–3898.Google Scholar

Samo, Y.-L. K. and Roberts, S. (2015), Generalized spectral kernels. arXiv:1506.02236 Google Scholar

Sankar, A., Spielman, D. A. and Teng, S.-H. (2006), ‘Smoothed analysis of the condition numbers and growth factors of matrices’, SIAM J. Matrix Anal. Appl. 28, 446–476.CrossRef Google Scholar

Sarlós, T. (2006), Improved approximation algorithms for large matrices via random projections. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS ’06), pp. 143–152.CrossRef Google Scholar

Schoenberg, I. J. (1942), ‘Positive definite functions on spheres’, Duke Math. J. 9, 96–108.CrossRef Google Scholar

Schölkopf, B. and Smola, A. (2001), Learning with Kernels, MIT Press.Google Scholar

Schölkopf, B., Smola, A. and Müller, K.-R. (1996), Nonlinear component analysis as a kernel eigenvalue problem. Technical report 44, Max-Planck-Institut für biologische Kybernetik.Google Scholar

Simchowitz, M., Alaoui, A. E. and Recht, B. (2017), On the gap between strict-saddles and true convexity: An Omega

$(\log d)$ lower bound for eigenvector approximation. arXiv:1704.04548 Google Scholar

Sloane, N. J. A. (1983), Encrypting by random rotations. In Cryptography (Burg Feuerstein, 1982), Vol. 149 of Lecture Notes in Computer Science, Springer, pp. 71–128.Google Scholar

Sorensen, D. C. and Embree, M. (2016), ‘A DEIM induced CUR factorization’, SIAM J. Sci. Comput. 38, A1454–A1482.CrossRef Google Scholar

Spielman, D. A. and Srivastava, N. (2011), ‘Graph sparsification by effective resistances’, SIAM J. Comput. 40, 1913–1926.CrossRef Google Scholar

Sriperumbudur, B. and Szabó, Z. (2015), Optimal rates for random Fourier features. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 1144–1152.Google Scholar

Stewart, G. W. (1994), UTV decompositions. In Numerical Analysis 1993, Vol. 303 of Pitman Research Notes in Mathematics Series, Longman Scientific and Technical, pp. 225–225.Google Scholar

Stewart, G. W. (1998), Matrix Algorithms , Vol. 1: Basic Decompositions, SIAM.Google Scholar

Stewart, G. W. (1999), ‘The QLP approximation to the singular value decomposition’, SIAM J. Sci. Comput. 20, 1336–1348.Google Scholar

Stewart, G. W. (2001), Matrix Algorithms , Vol. 2: Eigensystems, SIAM.Google Scholar

Stojnic, M. (2010),

${\ell}_1$ optimization and its various thresholds in compressed sensing. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3910–3913.CrossRef Google Scholar

Strang, G. (2019), Linear Algebra and Learning from Data, Wellesley-Cambridge Press.Google Scholar

Strassen, V. (1969), ‘Gaussian elimination is not optimal’, Numer. Math. 13, 354–356.CrossRef Google Scholar

Strohmer, T. and Vershynin, R. (2009), ‘A randomized Kaczmarz algorithm with exponential convergence’, J. Fourier Anal. Appl. 15, 262–278.CrossRef Google Scholar

Sun, Y., Guo, Y., Tropp, J. A. and Udell, M. (2018), Tensor random projection for low memory dimension reduction. In 32nd Conference on Neural Information Processing Systems, Montréal, Canada .Google Scholar

Szabó, Z. and Sriperumbudur, B. (2019), On kernel derivative approximation with random Fourier features. In 22nd International Conference on Artificial Intelligence and Statistics (Chaudhuri, K. and Sugiyama, M., eds), Vol. 89 of Proceedings of Machine Learning Research, PMLR, pp. 827–836.Google Scholar

Teng, S.-H. (2010), The Laplacian paradigm: Emerging algorithms for massive graphs. In Theory and Applications of Models of Computation, Vol. 6108 of Lecture Notes in Computer Science, Springer, pp. 2–14.CrossRef Google Scholar

Thrampoulidis, C. and Hassibi, B. (2015), Isotropically random orthogonal matrices: Performance of LASSO and minimum conic singular values. In 2015 IEEE International Symposium on Information Theory (ISIT), pp. 556–560.CrossRef Google Scholar

Thrampoulidis, C., Oymak, S. and Hassibi, B. (2014), The Gaussian min-max theorem in the presence of convexity. arXiv:1408.4837 Google Scholar

Thurau, C., Kersting, K. and Bauckhage, C. (2012), Deterministic CUR for improved large-scale data analysis: An empirical study. In 2012 SIAM International Conference on Data Mining, SIAM, pp. 684–695.Google Scholar

Tomczak-Jaegermann, N. (1974), ‘The moduli of smoothness and convexity and the Rademacher averages of trace classes

${S}_p\left(1\le p<\infty \right)$ ’, Studia Math. 50, 163–182.Google Scholar

Ton, J.-F., Flaxman, S., Sejdinovic, D. and Bhatt, S. (2018), ‘Spatial mapping with Gaussian processes and nonstationary Fourier features’, Spat. Statist. 28, 59–78.CrossRef Google Scholar PubMed

Trefethen, L. N. and Bau III, D. (1997), Numerical Linear Algebra, SIAM.CrossRef Google Scholar

Trogdon, T. (2017), ‘On spectral and numerical properties of random butterfly matrices’, Appl. Math. Lett. 95, 48–58.CrossRef Google Scholar

Tropp, J. A. (2011 a), ‘Freedman’s inequality for matrix martingales’, Electron. Commun. Probab. 16, 262–270.CrossRef Google Scholar

Tropp, J. A. (2011 b), ‘Improved analysis of the subsampled randomized Hadamard transform’, Adv. Adapt. Data Anal. 3, 115–126.CrossRef Google Scholar

Tropp, J. A. (2012 a), ‘A comparison principle for functions of a uniformly random subspace’, Probab. Theory Related Fields 153, 759–769.CrossRef Google Scholar

Tropp, J. A. (2012 b), ‘User-friendly tail bounds for sums of random matrices’, Found. Comput. Math. 12, 389–434.CrossRef Google Scholar

Tropp, J. A. (2015), ‘An introduction to matrix concentration inequalities’, Found. Trends Mach. Learn. 8, 1–230.CrossRef Google Scholar

Tropp, J. A. (2016), The expected norm of a sum of independent random matrices: An elementary approach. In High Dimensional Probability VII, Vol. 71 of Progress in Probability, Springer, pp. 173–202.CrossRef Google Scholar

Tropp, J. A. (2018), Analysis of randomized block Krylov methods. Under revision.Google Scholar

Tropp, J. A. (2019), Matrix concentration and computational linear algebra. CMS Lecture Notes 2019-01, Caltech, Pasadena, CA.Google Scholar

Tropp, J. A., Wakin, M. B., Duarte, M. F., Baron, D. and Baraniuk, R. G. (2006), Random filters for compressive sampling and reconstruction. In 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, Vol. 3.CrossRef Google Scholar

Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2017 a), Fixed-rank approximation of a positive-semidefinite matrix from streaming data. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 1225–1234.Google Scholar

Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2017 b), ‘Practical sketching algorithms for low-rank matrix approximation’, SIAM J. Matrix Anal. Appl. 38, 1454–1485.CrossRef Google Scholar

Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2019), ‘Streaming low-rank matrix approximation with an application to scientific simulation’, SIAM J. Sci. Comput. 41, A2430–A2463.CrossRef Google Scholar

Ubaru, S., Chen, J. and Saad, Y. (2017), ‘Fast estimation of

$\mathsf{tr}\left(f(A)\right)$ via stochastic Lanczos quadrature’, SIAM J. Matrix Anal. Appl. 38, 1075–1099.CrossRef Google Scholar

Ullah, E., Mianjy, P., Marinov, T. V. and Arora, R. (2018), Streaming kernel PCA with

$\widetilde{O}\left(\sqrt{n}\right)$ random features. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 7311–7321.Google Scholar

Upadhyay, J. (2016), Fast and space-optimal low-rank factorization in the streaming model with application in differential privacy. arXiv:1604.01429 Google Scholar

Urano, Y. (2013), A fast randomized algorithm for linear least-squares regression via sparse transforms. Master’s thesis, New York University.Google Scholar

Van Handel, R. (2016), Probability in high dimension. APC 550 Lecture Notes, Princeton University.Google Scholar

Vershynin, R. (2018), High-Dimensional Probability: An Introduction with Applications in Data Science, Vol. 47 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.CrossRef Google Scholar

Vershynin, R. (2019), Concentration inequalities for random tensors. arXiv:1905.00802 Google Scholar

Voronin, S. and Martinsson, P.-G. (2017), ‘Efficient algorithms for CUR and interpolative matrix decompositions’, Adv. Comput. Math. 43, 495–516.CrossRef Google Scholar

Waldron, S. F. D. (2018), An Introduction to Finite Tight Frames, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer.CrossRef Google Scholar

Wang, S. (2019), Simple and almost assumption-free out-of-sample bound for random feature mapping. arXiv:1909.11207 Google Scholar

Wang, S., Gittens, A. and Mahoney, M. W. (2019), ‘Scalable kernel

$k$ -means clustering with Nyström approximation: Relative-error bounds’, J. Mach. Learn. Res. 20, 431–479.Google Scholar

Whaley, R. C. and Dongarra, J. J. (1998), Automatically tuned linear algebra software. In 1998 ACM/IEEE Conference on Supercomputing (SC ’98), IEEE, pp. 1–27.Google Scholar

Williams, C. K. I. and Seeger, M. (2001), Using the Nyström method to speed up kernel machines. In Advances in Neural Information Processing Systems 13 (Leen, T. K., Dietterich, T. G. and Tresp, V., eds), MIT Press, pp. 682–688.Google Scholar

Woodruff, D. P. (2014), ‘Sketching as a tool for numerical linear algebra’, Found. Trends Theor. Comput. Sci. 10, 1–157.Google Scholar

Woolfe, F., Liberty, E., Rokhlin, V. and Tygert, M. (2008), ‘A fast randomized algorithm for the approximation of matrices’, Appl. Comput. Harmon. Anal. 25, 335–366.CrossRef Google Scholar

Wootters, W. K. and Fields, B. D. (1989), ‘Optimal state-determination by mutually unbiased measurements’, Ann. Phys. 191, 363–381.CrossRef Google Scholar

Xia, J. (2013), ‘Randomized sparse direct solvers’, SIAM J. Matrix Anal. Appl. 34, 197–227.CrossRef Google Scholar

Xiao, J., Gu, M. and Langou, J. (2017), Fast parallel randomized QR with column pivoting algorithms for reliable low-rank matrix approximations. In 2017 IEEE 24th International Conference on High Performance Computing (HiPC), IEEE, pp. 233–242.Google Scholar

Yu, C. D., Levitt, J., Reiz, S. and Biros, G. (2017 a), Geometry-oblivious FMM for compressing dense SPD matrices. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’17), ACM, art. 53.Google Scholar

Yu, W., Gu, Y., Li, J., Liu, S. and Li, Y. (2017 b), Single-pass PCA of large high-dimensional data. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), pp. 3350–3356.Google Scholar

Yuan, Q., Gu, M. and Li, B. (2018), Superlinear convergence of randomized block Lanczos algorithm. In 2018 IEEE International Conference on Data Mining (ICDM), pp. 1404–1409.CrossRef Google Scholar

Zhang, F., ed. (2005), The Schur Complement and its Applications , Vol. 4 of Numerical Methods and Algorithms, Springer.CrossRef Google Scholar

Zouzias, A. (2013), Randomized primitives for linear algebra and applications. PhD thesis, University of Toronto.Google Scholar

Article contents

Randomized numerical linear algebra: Foundations and algorithms

Abstract

Access options

References

References 2

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests