TuckerMPI
ACM Transactions on Mathematical Software (IF 2.7), Pub Date: 2020-06-01, DOI: 10.1145/3378445
Grey Ballard, Alicia Klinvex, Tamara G. Kolda

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed a new software package called TuckerMPI, a parallel C++/MPI software package for compressing distributed data. The approach is based on treating the data as a tensor, i.e., a multidimensional array, and computing its truncated Tucker decomposition, a higher-order analogue to the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5- and 6.7-terabyte data sets distributed across 100s of nodes (1,000s of MPI processes), achieving compression ratios between 100 and 200,000×, which equates to 99--99.999% compression (depending on the desired accuracy), in substantially less time than it would take even to read the same data set from a parallel file system. Moreover, we show that our method also allows for reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, e.g., when reconstructing/visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
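To illustrate the idea behind the decomposition the abstract describes, the following is a minimal, single-node Python/NumPy sketch of a sequentially truncated Tucker decomposition (ST-HOSVD) with reconstruction and a compression-ratio check. It is not the TuckerMPI C++/MPI API; all function names (`st_hosvd`, `unfold`, `mode_multiply`, etc.) and the synthetic data are illustrative assumptions only.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a target tensor shape."""
    moved = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape(moved), 0, mode)

def mode_multiply(tensor, matrix, mode):
    """n-mode product: multiply `matrix` into `tensor` along `mode`."""
    result = matrix @ unfold(tensor, mode)
    new_shape = list(tensor.shape)
    new_shape[mode] = matrix.shape[0]
    return fold(result, mode, new_shape)

def st_hosvd(tensor, ranks):
    """Sequentially truncated HOSVD: returns (core tensor, factor matrices)."""
    core, factors = tensor, []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(core, mode), full_matrices=False)
        u = u[:, :r]                           # keep leading r left singular vectors
        factors.append(u)
        core = mode_multiply(core, u.T, mode)  # shrink this mode to rank r
    return core, factors

def reconstruct(core, factors):
    """Expand the compressed (core, factors) representation back to a full tensor."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_multiply(out, u, mode)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic low-rank 3-way array standing in for grid-structured simulation output.
    data = np.einsum("ia,jb,kc,abc->ijk",
                     rng.standard_normal((60, 5)),
                     rng.standard_normal((50, 4)),
                     rng.standard_normal((40, 3)),
                     rng.standard_normal((5, 4, 3)))
    core, factors = st_hosvd(data, ranks=(5, 4, 3))
    stored = core.size + sum(u.size for u in factors)
    print("compression ratio ~", data.size / stored)
    approx = reconstruct(data := data, factors=factors) if False else reconstruct(core, factors)
    print("relative error:", np.linalg.norm(data - approx) / np.linalg.norm(data))
```

The stored representation is only the small core tensor plus one tall, skinny factor matrix per mode, which is where the large compression ratios come from; reconstructing only a slice or a down-sampled view amounts to applying the corresponding rows of the factor matrices, which is why partial reconstruction fits on a single machine.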

Updated: 2020-06-01