A reduced-precision streaming SpMV architecture for Personalized PageRank on FPGA
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-09-22, DOI: arxiv-2009.10443
Alberto Parravicini, Francesco Sgherzi, Marco D. Santambrogio

Sparse matrix-vector multiplication (SpMV) is widely used in data-analytic workloads where low latency and high throughput matter more than exact numerical convergence. FPGAs provide fast execution times while offering precise control over the accuracy of the results through reduced-precision fixed-point arithmetic. In this work, we propose a novel streaming implementation of Coordinate Format (COO) sparse matrix-vector multiplication and study its effectiveness when applied to the Personalized PageRank algorithm, a common building block of recommender systems in e-commerce websites and social networks. On 8 different datasets, our implementation achieves speedups of up to 6x over both a reference floating-point FPGA architecture and a state-of-the-art multi-threaded CPU implementation, while preserving the numerical fidelity of the results and reaching up to 42x higher energy efficiency than the CPU implementation.
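The idea in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' FPGA architecture: it emulates fixed-point arithmetic by truncating values to a configurable number of fractional bits, streams over COO triplets to compute the matrix-vector product, and uses that product inside a Personalized PageRank power iteration. The function names (`quantize`, `coo_spmv`, `personalized_pagerank`), the damping factor of 0.85, the 24-bit default, and the choice to return dangling-vertex mass to the personalization vertex are all assumptions made for illustration.

```python
from typing import List, Tuple


def quantize(x: float, frac_bits: int) -> float:
    """Truncate a non-negative value to a grid with `frac_bits` fractional bits.

    This only emulates reduced-precision fixed point in software (assumption).
    """
    scale = 1 << frac_bits
    return int(x * scale) / scale


def coo_spmv(rows: List[int], cols: List[int], vals: List[float],
             x: List[float], n: int, frac_bits: int) -> List[float]:
    """Compute y = A @ x by streaming over COO triplets (row, col, value)."""
    y = [0.0] * n
    for r, c, v in zip(rows, cols, vals):
        product = quantize(v * x[c], frac_bits)
        y[r] = quantize(y[r] + product, frac_bits)
    return y


def personalized_pagerank(edges: List[Tuple[int, int]], n: int, source: int,
                          alpha: float = 0.85, iters: int = 20,
                          frac_bits: int = 24) -> List[float]:
    """Power iteration for Personalized PageRank with one personalization vertex."""
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1

    # COO entries of the transposed transition matrix: (dst, src) = 1 / outdeg(src).
    rows = [dst for _, dst in edges]
    cols = [src for src, _ in edges]
    vals = [quantize(1.0 / out_deg[src], frac_bits) for src, _ in edges]
    dangling = [v for v in range(n) if out_deg[v] == 0]

    p = [0.0] * n
    p[source] = 1.0
    for _ in range(iters):
        y = coo_spmv(rows, cols, vals, p, n, frac_bits)
        # Assumption: mass held by dangling vertices goes back to the source.
        dangling_mass = sum(p[v] for v in dangling)
        p_next = [quantize(alpha * y_i, frac_bits) for y_i in y]
        restart = alpha * dangling_mass + (1.0 - alpha)
        p_next[source] = quantize(p_next[source] + restart, frac_bits)
        p = p_next
    return p
```

Calling `personalized_pagerank(edges, n, source, frac_bits=24)` and again with a much larger `frac_bits` gives a rough sense of how much ranking quality is lost to reduced precision on a given graph; the speed and energy figures reported in the abstract depend on the hardware design and cannot be reproduced by this emulation.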

Updated: 2020-09-23