Comparison of Accuracy and Scalability of Gauss-Newton and Alternating Least Squares for CANDECOMP/PARAFAC Decomposition
SIAM Journal on Scientific Computing (IF 3.1), Pub Date: 2021-08-04, DOI: 10.1137/20m1344561
Navjot Singh, Linjian Ma, Hongru Yang, Edgar Solomonik

SIAM Journal on Scientific Computing, Volume 43, Issue 4, Page C290-C311, January 2021.
Alternating least squares (ALS) is the most widely used algorithm for CANDECOMP/PARAFAC (CP) tensor decomposition. However, ALS may converge slowly or not at all, especially when high accuracy is required. An alternative approach is to regard CP decomposition as a nonlinear least squares problem and employ Newton-like methods. Direct solution of linear systems involving an approximated Hessian is generally expensive. However, recent advancements have shown that using an implicit representation of the linear system makes these methods competitive with ALS. We provide the first parallel implementation of a Gauss-Newton method for CP decomposition, which iteratively solves linear least squares problems at each Gauss-Newton step. In particular, we leverage a formulation that employs tensor contractions for implicit matrix-vector products within the conjugate gradient method. The use of tensor contractions enables us to employ the Cyclops library for distributed-memory tensor computations to parallelize the Gauss-Newton approach with a high-level Python implementation. In addition, we propose a regularization scheme for the Gauss-Newton method to improve convergence properties without any additional cost. We study the convergence of variants of the Gauss-Newton method relative to ALS for finding exact CP decompositions as well as approximate decompositions of real-world tensors. We evaluate the performance of sequential and parallel versions of both approaches, and study the parallel scalability on the Stampede2 supercomputer.
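
The implicit matrix-vector product mentioned in the abstract can be illustrated with a small single-node sketch. This is not the authors' Cyclops implementation: it uses plain NumPy, is restricted to order-3 tensors, and every function name (`gn_matvec`, `dense_gn_matvec`, `cg_solve`) is hypothetical. The damping parameter `lam` is a generic Tikhonov-style term assumed here for illustration, not the regularization scheme proposed in the paper. The idea is that the Gauss-Newton matrix J^T J for a rank-R CP model can be applied to a block vector using only R x R Gram-type matrices, so J is never formed.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    """Dense order-3 tensor sum_r A[:, r] (outer) B[:, r] (outer) C[:, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def gn_matvec(factors, vecs, lam=0.0):
    """Apply (J^T J + lam * I) to a block vector without forming J.

    factors: list of CP factor matrices, each of shape (dim_n, R).
    vecs:    list of arrays with the same shapes as the factors.
    lam:     damping parameter (an assumption of this sketch).
    """
    N = len(factors)
    grams = [F.T @ F for F in factors]                 # F_m^T F_m, R x R
    cross = [V.T @ F for V, F in zip(vecs, factors)]   # V_p^T F_p, R x R
    out = []
    for n in range(N):
        # Diagonal block: V_n times the Hadamard product of the other Grams.
        G = np.ones_like(grams[0])
        for m in range(N):
            if m != n:
                G = G * grams[m]
        res = vecs[n] @ G + lam * vecs[n]
        # Off-diagonal blocks: F_n times (V_p^T F_p) Hadamard remaining Grams.
        for p in range(N):
            if p == n:
                continue
            M = cross[p].copy()
            for m in range(N):
                if m != n and m != p:
                    M = M * grams[m]
            res = res + factors[n] @ M
        out.append(res)
    return out


def dense_gn_matvec(factors, vecs):
    """Reference for checking: form J v as a dense tensor, then apply J^T."""
    A, B, C = factors
    VA, VB, VC = vecs
    JV = (cp_reconstruct(VA, B, C) + cp_reconstruct(A, VB, C)
          + cp_reconstruct(A, B, VC))
    return [
        np.einsum("ijk,jr,kr->ir", JV, B, C),
        np.einsum("ijk,ir,kr->jr", JV, A, C),
        np.einsum("ijk,ir,jr->kr", JV, A, B),
    ]


def block_dot(xs, ys):
    """Inner product of two block vectors stored as lists of arrays."""
    return sum(np.vdot(x, y) for x, y in zip(xs, ys))


def cg_solve(matvec, rhs, tol=1e-10, max_iter=200):
    """Plain conjugate gradient on block vectors, starting from zero."""
    x = [np.zeros_like(b) for b in rhs]
    r = [b.copy() for b in rhs]
    p = [ri.copy() for ri in r]
    rs = block_dot(r, r)
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / block_dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = block_dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

In a Gauss-Newton iteration one would call `cg_solve(lambda v: gn_matvec(factors, v, lam), rhs)` with `rhs` set to the negative gradient blocks, then add the resulting step to the factors. Because `gn_matvec` consists only of small matrix products and elementwise multiplications, the same contractions could in principle be expressed in a distributed tensor library, which is the property the abstract relies on.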


Updated: 2021-08-05