Block tensor train decomposition for missing data estimation
Statistical Papers (IF 1.2), Pub Date: 2018-09-06, DOI: 10.1007/s00362-018-1043-8
Namgil Lee, Jong-Min Kim

We propose a method for imputing missing values in large-scale matrix data based on a low-rank tensor approximation technique called the block tensor train (BTT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD is performed via a low-rank BTT decomposition, which dramatically reduces storage and time complexity for large-scale data matrices that admit a low-rank tensor structure. An iterative soft-thresholding algorithm for missing data estimation is implemented based on an alternating least squares method for the BTT decomposition. Experimental results on simulated data and real benchmark data demonstrate that the proposed method accurately estimates a large number of missing values compared with a standard matrix-based method. The R source code of the BTT-based imputation method is available at https://github.com/namgillee/BTTSoftImpute.
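The core imputation loop follows the SoftImpute idea: repeatedly replace the missing entries with a soft-thresholded low-rank SVD reconstruction while keeping the observed entries fixed. The sketch below is a minimal matrix-based illustration of that loop in R, assuming a plain full SVD; the paper's contribution is to replace this SVD step with a BTT-based low-rank SVD computed by alternating least squares (see the repository above). The function name soft_impute and its parameters are illustrative and not taken from the BTTSoftImpute package.

# Minimal matrix-based sketch of iterative SVD soft-thresholding
# (SoftImpute-style). This is NOT the authors' BTT implementation:
# the paper replaces the full svd() call below with a BTT-based
# low-rank SVD computed by alternating least squares.
soft_impute <- function(X, lambda = 1, max_iter = 100, tol = 1e-4) {
  obs <- !is.na(X)                      # mask of observed entries
  Z <- X
  Z[!obs] <- 0                          # initialize missing entries with zeros
  for (iter in seq_len(max_iter)) {
    s <- svd(Z)                         # full SVD of the current completed matrix
    d <- pmax(s$d - lambda, 0)          # soft-threshold the singular values
    Z_new <- s$u %*% diag(d, length(d)) %*% t(s$v)  # low-rank reconstruction
    Z_new[obs] <- X[obs]                # keep observed entries fixed
    delta <- norm(Z_new - Z, "F") / max(norm(Z, "F"), 1e-12)
    Z <- Z_new
    if (delta < tol) break              # stop when the update is small
  }
  Z
}

# Example usage on a small synthetic rank-5 matrix with 30% missing entries
set.seed(1)
A <- matrix(rnorm(50 * 5), 50, 5) %*% matrix(rnorm(5 * 40), 5, 40)
X <- A
X[sample(length(X), round(0.3 * length(X)))] <- NA
X_hat <- soft_impute(X, lambda = 2)
mean((X_hat[is.na(X)] - A[is.na(X)])^2)  # error on the missing entries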

Updated: 2018-09-06