Accelerating the SpMV kernel on standard CPUs by exploiting the partially diagonal structures
arXiv - CS - Mathematical Software Pub Date : 2021-05-11 , DOI: arxiv-2105.04937
Takeshi Fukaya, Koki Ishida, Akie Miura, Takeshi Iwashita, Hiroshi Nakashima

Sparse matrix-vector multiplication (SpMV) is one of the basic building blocks of scientific computing, and there is a continuing demand for faster SpMV. In this research, we aim to accelerate SpMV on recent CPUs for sparse matrices that have a specific sparsity structure, namely a diagonally structured sparsity pattern. We focus on a hybrid storage format that combines the DIA and CSR formats, the so-called HDC format. First, we recall the importance of introducing cache blocking techniques into HDC-based SpMV kernels. Next, based on observations of the cache-blocked kernel, we present a modified version of the HDC format, which we call the M-HDC format, in which partial diagonal structures are expected to be picked up more efficiently. For these SpMV kernels, we theoretically analyze the expected performance improvement based on performance models. Then, we conduct comprehensive experiments on state-of-the-art multi-core CPUs. Through experiments using typical matrices, we clarify the detailed performance characteristics of each SpMV kernel. We also evaluate the performance for matrices appearing in practical applications and demonstrate that our approach can accelerate SpMV for some of them. Throughout the paper, we demonstrate that exploiting partial diagonal structures with the M-HDC format is a promising approach to accelerating SpMV on CPUs for certain kinds of practical sparse matrices.
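
To make the idea of a hybrid DIA/CSR layout concrete, the following is a minimal Python sketch, not the authors' implementation. It stores densely populated diagonals in a DIA part and the remaining nonzeros in a CSR part, then computes y = Ax by summing the two contributions. The function names `split_hdc` and `spmv_hdc` and the `min_fill` threshold are assumptions made for illustration; cache blocking and the M-HDC treatment of partial diagonals described in the abstract are deliberately left out.

```python
def split_hdc(a, min_fill=0.5):
    """Split a dense square matrix (list of lists) into a DIA part and a CSR part.

    A diagonal with offset k = col - row is stored in the DIA part when its
    nonzero count is at least min_fill * n; every other nonzero goes to CSR.
    min_fill is an illustrative threshold, not a value taken from the paper.
    """
    n = len(a)
    offsets, dia_vals = [], []
    for k in range(-(n - 1), n):
        rows = range(max(0, -k), min(n, n - k))  # rows where column i + k is valid
        nnz = sum(1 for i in rows if a[i][i + k] != 0)
        if nnz > 0 and nnz >= min_fill * n:
            offsets.append(k)
            # One slot per row; slots that fall outside the matrix are padded with 0.
            dia_vals.append([a[i][i + k] if 0 <= i + k < n else 0.0 for i in range(n)])
    in_dia = set(offsets)
    row_ptr, col_idx, csr_vals = [0], [], []
    for i in range(n):
        for j in range(n):
            if a[i][j] != 0 and (j - i) not in in_dia:
                col_idx.append(j)
                csr_vals.append(a[i][j])
        row_ptr.append(len(col_idx))
    return (offsets, dia_vals), (row_ptr, col_idx, csr_vals)


def spmv_hdc(dia, csr, x):
    """Compute y = A x as the sum of the DIA and CSR contributions."""
    offsets, dia_vals = dia
    row_ptr, col_idx, csr_vals = csr
    n = len(x)
    y = [0.0] * n
    # DIA part: contiguous access to vals and x, no column index array needed.
    for k, vals in zip(offsets, dia_vals):
        for i in range(max(0, -k), min(n, n - k)):
            y[i] += vals[i] * x[i + k]
    # CSR part: indirect access to x through col_idx.
    for i in range(n):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += csr_vals[p] * x[col_idx[p]]
    return y


# Tridiagonal matrix plus two scattered entries at (0, 2) and (4, 2).
A = [
    [4, -1, 2, 0, 0],
    [-1, 4, -1, 0, 0],
    [0, -1, 4, -1, 0],
    [0, 0, -1, 4, -1],
    [0, 0, 3, -1, 4],
]
dia_part, csr_part = split_hdc(A)
y = spmv_hdc(dia_part, csr_part, [1, 2, 3, 4, 5])
```

In this example the three diagonals with offsets -1, 0, and 1 land in the DIA part, and only the two scattered entries need column indices in the CSR part. The DIA loop reads `x` at consecutive positions, which is the property that makes diagonal storage attractive when a matrix has enough diagonal structure.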

Updated: 2021-05-12