ALTO: Adaptive Linearized Storage of Sparse Tensors
arXiv - CS - Performance. Pub Date: 2021-02-20, DOI: arxiv-2102.10245
Ahmed E. Helal, Jan Laukemann, Fabio Checconi, Jesmin Jahan Tithi, Teresa Ranadive, Fabrizio Petrini, Jeewhan Choi

The analysis of high-dimensional sparse data is becoming increasingly popular in many important domains. However, real-world sparse tensors are challenging to process due to their irregular shapes and data distributions. We propose the Adaptive Linearized Tensor Order (ALTO) format, a novel mode-agnostic (general) representation that keeps neighboring nonzero elements in the multi-dimensional space close to each other in memory. To generate the indexing metadata, ALTO uses an adaptive bit encoding scheme that trades off index computations for lower memory usage and more effective use of memory bandwidth. Moreover, by decoupling its sparse representation from the irregular spatial distribution of nonzero elements, ALTO eliminates the workload imbalance and greatly reduces the synchronization overhead of tensor computations. As a result, the parallel performance of ALTO-based tensor operations becomes a function of their inherent data reuse. On a gamut of tensor datasets, ALTO outperforms an oracle that selects the best state-of-the-art format for each dataset, when used in key tensor decomposition operations. Specifically, ALTO achieves a geometric mean speedup of 8X over the best mode-agnostic format, while delivering a geometric mean compression ratio of more than 4X relative to the best mode-specific format.
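To make the idea of a linearized, mode-agnostic index more concrete, below is a minimal Python sketch (not the authors' code) of one way to pack a nonzero's multi-dimensional coordinate into a single integer by interleaving the index bits of each mode, sized to the tensor's dimensions. The function names, the 3-way example shape, and the simple round-robin interleaving are illustrative assumptions; ALTO's actual encoding adapts the bit layout to the mode sizes and nonzero distribution as described in the paper.

```python
# Illustrative sketch of linearized sparse-tensor indexing (not ALTO itself):
# each nonzero's coordinate is packed into one compact integer whose bits are
# drawn from the per-mode indices, so nearby coordinates map to nearby keys.

def bits_needed(dim):
    """Number of bits required to address indices 0 .. dim-1."""
    return max(1, (dim - 1).bit_length())

def linearize(coord, dims):
    """Pack a multi-dimensional coordinate into one integer by
    round-robin interleaving of the modes' index bits (Morton-style)."""
    nbits = [bits_needed(d) for d in dims]
    value, out_pos = 0, 0
    for level in range(max(nbits)):
        for mode, idx in enumerate(coord):
            if level < nbits[mode]:
                value |= ((idx >> level) & 1) << out_pos
                out_pos += 1
    return value

def delinearize(value, dims):
    """Recover the per-mode indices from the packed integer
    (inverse of linearize, same bit-visit order)."""
    nbits = [bits_needed(d) for d in dims]
    coord = [0] * len(dims)
    in_pos = 0
    for level in range(max(nbits)):
        for mode in range(len(dims)):
            if level < nbits[mode]:
                coord[mode] |= ((value >> in_pos) & 1) << level
                in_pos += 1
    return tuple(coord)

if __name__ == "__main__":
    dims = (1000, 80, 16)              # hypothetical 3-way tensor shape
    nonzeros = [(3, 5, 7), (4, 5, 7), (900, 79, 15)]
    for key in sorted(linearize(c, dims) for c in nonzeros):
        print(key, "->", delinearize(key, dims))
```

In this sketch the three mode indices fit in 10 + 7 + 4 = 21 bits, so one 32-bit key replaces three separate index arrays, and sorting nonzeros by the key places spatially neighboring elements close together in memory, which is the locality and compression effect the abstract attributes to ALTO's adaptive encoding.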

Updated: 2021-02-23