Abstract
We propose a method for imputing missing values in large-scale matrix data, based on a low-rank tensor approximation technique called the block tensor train (BTT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD is carried out in a low-rank BTT format, which dramatically reduces storage and time complexity for large-scale data matrices that admit a low-rank tensor structure. Missing values are estimated by an iterative soft-thresholding algorithm, implemented via an alternating least squares method for BTT decomposition. Experimental results on simulated data and real benchmark data demonstrate that the proposed method can accurately estimate a large number of missing values, compared with a standard matrix-based method. The R source code of the BTT-based imputation method is available at https://github.com/namgillee/BTTSoftImpute.
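For concreteness, the following is a minimal R sketch of the iterative soft-thresholding (soft-impute) scheme described above. It uses R's dense svd() where the proposed method would use a BTT-based SVD fitted by alternating least squares; the function name soft_impute and its arguments are illustrative assumptions, not the released BTTSoftImpute API.

```r
## A minimal sketch (not the released BTTSoftImpute API) of the iterative
## soft-thresholding scheme described in the abstract. For clarity it uses
## R's dense svd(); the proposed method instead computes this SVD via a
## low-rank BTT decomposition fitted by alternating least squares.
soft_impute <- function(X, lambda, max_iter = 100, tol = 1e-4) {
  obs <- !is.na(X)                        # mask of observed entries
  Z   <- matrix(0, nrow(X), ncol(X))      # current completed-matrix estimate
  for (iter in seq_len(max_iter)) {
    Y <- X
    Y[!obs] <- Z[!obs]                    # fill missing entries with estimate
    s <- svd(Y)                           # dense SVD stands in for BTT-SVD
    d <- pmax(s$d - lambda, 0)            # soft-threshold singular values
    Z_new <- s$u %*% (d * t(s$v))         # reconstruct low-rank estimate
    if (sum((Z_new - Z)^2) / max(sum(Z^2), 1) < tol) {
      Z <- Z_new
      break
    }
    Z <- Z_new
  }
  Z
}

## Illustrative usage on a synthetic rank-5 matrix with 50% missing entries
set.seed(1)
A <- matrix(rnorm(100 * 5), 100, 5) %*% matrix(rnorm(5 * 80), 5, 80)
X <- A
X[sample(length(X), length(X) %/% 2)] <- NA
Z <- soft_impute(X, lambda = 2)
```

The key design point, as the abstract notes, is that the SVD step inside the loop is the bottleneck for large matrices, and replacing it with a BTT-based SVD is what makes the iteration scale.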
Acknowledgements
This study was supported by a 2017 Research Grant from Kangwon National University and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2017R1C1B5076912).
Cite this article
Lee, N., Kim, JM. Block tensor train decomposition for missing data estimation. Stat Papers 59, 1283–1305 (2018). https://doi.org/10.1007/s00362-018-1043-8