
Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis

Published: 09 January 2017

Abstract

Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. This article presents an essentially black-box, foolproof implementation for MathWorks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease of use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces).
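
To make the idea of randomized low-rank approximation concrete, the following is a minimal sketch in Python with NumPy. It is not the article's MATLAB implementation. It follows the generic randomized range-finder scheme with subspace (power) iterations described in the literature (for example, Halko, Martinsson, and Tropp 2011). The function name `randomized_svd` and the parameters `oversample`, `n_iter`, and `seed` are hypothetical names chosen for illustration, and any centering of the data for principal component analysis is assumed to be done by the caller.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, n_iter=2, seed=0):
    """Approximate the k leading singular triplets of A.

    Illustrative sketch only: names and defaults are assumptions,
    not the interface of the article's MATLAB code.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sketch width: target rank plus a few extra columns for accuracy.
    width = min(k + oversample, m, n)

    # Sample the range of A with a Gaussian test matrix, then orthonormalize.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, width)))

    # Subspace iterations sharpen the basis when singular values decay slowly.
    # Re-orthonormalizing after every product limits loss of accuracy.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Project A onto the captured subspace and decompose the small matrix.
    B = Q.T @ A  # shape (width, n)
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small

    # Keep only the k leading components.
    return U[:, :k], s[:k], Vt[:k, :]
```

A call such as `U, s, Vt = randomized_svd(A, 5)` returns factors whose product `(U * s) @ Vt` approximates `A`. For principal component analysis, one would typically subtract the column means from `A` first. The smallest singular values are not targeted by this scheme, consistent with the abstract's remark that classical methods remain preferable for them.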



      Published in

      ACM Transactions on Mathematical Software, Volume 43, Issue 3
      September 2017
      232 pages
      ISSN: 0098-3500
      EISSN: 1557-7295
      DOI: 10.1145/2988516

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 January 2017
      • Revised: 1 September 2016
      • Accepted: 1 September 2016
      • Received: 1 December 2014


      Qualifiers

      • research-article
      • Research
      • Refereed
