A Parallel Sparse Tensor Benchmark Suite on CPUs and GPUs, arXiv - CS - Performance


arXiv - CS - Performance, Pub Date: 2020-01-02, DOI: arxiv-2001.00660
Jiajia Li, Mahesh Lakshminarasimhan, Xiaolong Wu, Ang Li, Catherine Olschanowsky, and Kevin Barker

Tensor computations present significant performance challenges that affect a wide spectrum of applications, ranging from machine learning, healthcare analytics, social network analysis, and data mining to quantum chemistry and signal processing. Efforts to improve the performance of tensor computations include exploring data layout, execution scheduling, and parallelism in common tensor kernels. This work presents a benchmark suite for arbitrary-order sparse tensor kernels on CPUs and GPUs, using two state-of-the-art tensor formats: coordinate (COO) and hierarchical coordinate (HiCOO). It provides a set of reference tensor kernel implementations that are compatible with real-world tensors and with power-law tensors generated by extending synthetic graph generation techniques. We also propose Roofline performance models for these kernels to provide insight into computer platforms from a sparse tensor perspective.
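As a rough illustration of the terms used in the abstract, the following Python sketch stores an arbitrary-order sparse tensor in COO form (one index tuple and one value per nonzero) and evaluates the classic Roofline bound, min(peak compute, memory bandwidth × arithmetic intensity). It is not taken from the benchmark suite: the names `CooTensor`, `ttv`, and `roofline_gflops` are hypothetical, the tensor-times-vector kernel is chosen here only as an example of a sparse tensor kernel, and the hardware numbers are made up. HiCOO is not sketched.

```python
from dataclasses import dataclass


@dataclass
class CooTensor:
    """Order-N sparse tensor in coordinate (COO) form (hypothetical helper)."""
    shape: tuple    # dimension sizes, one per mode
    indices: list   # one index tuple per nonzero
    values: list    # nonzero values, parallel to `indices`


def ttv(tensor, vector, mode):
    """Tensor-times-vector: contract `mode` of `tensor` with `vector`.

    Returns an order N-1 tensor in COO form. Illustrative kernel only.
    """
    if len(vector) != tensor.shape[mode]:
        raise ValueError("vector length must match the contracted mode")
    acc = {}
    for idx, val in zip(tensor.indices, tensor.values):
        out_idx = idx[:mode] + idx[mode + 1:]   # drop the contracted index
        acc[out_idx] = acc.get(out_idx, 0.0) + val * vector[idx[mode]]
    out_shape = tensor.shape[:mode] + tensor.shape[mode + 1:]
    keys = sorted(acc)
    return CooTensor(out_shape, keys, [acc[k] for k in keys])


def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Classic Roofline bound on attainable performance, in GFLOP/s."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# A 2 x 3 x 2 tensor with four nonzeros.
x = CooTensor(
    shape=(2, 3, 2),
    indices=[(0, 0, 0), (0, 2, 1), (1, 1, 0), (1, 1, 1)],
    values=[1.0, 2.0, 3.0, 4.0],
)
y = ttv(x, [10.0, 1.0], mode=2)

# Made-up platform: 1000 GFLOP/s peak, 100 GB/s bandwidth, 0.25 flop/byte.
bound = roofline_gflops(1000.0, 100.0, 0.25)
```

With these inputs, `y` has shape `(2, 3)` with nonzeros at `(0, 0)`, `(0, 2)`, and `(1, 1)`, and the bound is set by memory bandwidth rather than peak compute, which is the typical regime for low-intensity sparse kernels.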


Updated: 2020-01-06
