A Parallel Sparse Tensor Benchmark Suite on CPUs and GPUs
arXiv - CS - Performance Pub Date : 2020-01-02 , DOI: arxiv-2001.00660 Jiajia Li and Mahesh Lakshminarasimhan and Xiaolong Wu and Ang Li and Catherine Olschanowsky and Kevin Barker
Tensor computations present significant performance challenges that affect a
wide spectrum of applications, ranging from machine learning, healthcare
analytics, social network analysis, and data mining to quantum chemistry and
signal processing. Efforts to improve the performance of tensor computations
include exploring data layout, execution scheduling, and parallelism in common
tensor kernels. This work presents a benchmark suite for arbitrary-order sparse
tensor kernels using two state-of-the-art tensor formats, coordinate (COO) and
hierarchical coordinate (HiCOO), on CPUs and GPUs. It provides a set of
reference tensor kernel implementations that are compatible with real-world
tensors and with power-law tensors generated by extending synthetic graph
generation techniques. We also propose Roofline performance models for these
kernels to provide insight into computer platforms from a sparse tensor
perspective.
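As a rough illustration of the coordinate (COO) format named in the abstract, the sketch below stores a small third-order sparse tensor as a list of index tuples with a parallel list of values, then contracts it with a dense vector along one mode. The kernel choice (a tensor-times-vector product), the function name `ttv_coo`, and the sample data are assumptions made for illustration only; they are not taken from the benchmark suite itself.

```python
from collections import defaultdict

# A third-order sparse tensor in COO form: one index tuple per nonzero,
# plus a parallel list of values. The data here is made up for illustration.
indices = [(0, 0, 1), (0, 2, 0), (1, 1, 1), (2, 0, 2)]
values = [1.0, 2.0, 3.0, 4.0]


def ttv_coo(indices, values, vector, mode):
    """Contract a COO tensor with a dense vector along `mode` (hypothetical helper)."""
    out = defaultdict(float)
    for idx, val in zip(indices, values):
        # Drop the contracted index; the remaining indices address the output.
        rest = idx[:mode] + idx[mode + 1:]
        out[rest] += val * vector[idx[mode]]
    return dict(out)


# Contract along the last mode; the result is a sparse second-order tensor.
result = ttv_coo(indices, values, vector=[1.0, 10.0, 100.0], mode=2)
```

Because the loop visits only the stored nonzeros and never assumes a fixed number of indices per entry, the same sketch works for tensors of any order, which is the property the abstract refers to as arbitrary-order support.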
Updated: 2020-01-06