Substream-Centric Maximum Matchings on FPGA
ACM Transactions on Reconfigurable Technology and Systems (IF 2.3), Pub Date: 2020-05-04, DOI: 10.1145/3377871
Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes De Fine Licht, Torsten Hoefler

Developing high-performance and energy-efficient algorithms for maximum matchings is becoming increasingly important in social network analysis, computational sciences, scheduling, and other domains. In this work, we propose the first maximum matching algorithm designed for FPGAs; it is energy-efficient and has provable guarantees on accuracy, performance, and storage utilization. To achieve this, we forgo popular graph processing paradigms, such as vertex-centric programming, that often entail large communication costs. Instead, we propose a substream-centric approach, in which the input stream of data is divided into substreams that are processed independently, enabling more parallelism while lowering communication costs. We base our work on the theory of streaming graph algorithms and analyze 14 models and 28 algorithms. We use this analysis to provide a theoretical underpinning that matches the physical constraints of FPGA platforms. Our algorithm delivers high performance (more than 4× speedup over tuned parallel CPU variants), low memory usage, high accuracy, and effective usage of FPGA resources. The substream-centric approach could easily be extended to other algorithms to offer low-power and high-performance graph processing on FPGAs.
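
The substream idea can be illustrated with a small software sketch. The abstract does not say how substreams are formed or merged, so everything below is an assumption made for illustration: substream i accepts only edges whose weight is at least (1 + eps)^i, each substream keeps its own greedy maximal matching, and the results are merged greedily from the highest threshold down. The function name `substream_matching` and its parameters are hypothetical, and the code is a plain Python model rather than the authors' FPGA design.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight)


def substream_matching(edges: Iterable[Edge], num_substreams: int, eps: float) -> List[Edge]:
    """Approximate weighted matching from one pass over an edge stream.

    Assumed scheme (not taken from the abstract): weight-threshold
    substreams, a greedy maximal matching per substream, greedy merge.
    """
    # Substream i only accepts edges with weight >= (1 + eps) ** i.
    thresholds = [(1.0 + eps) ** i for i in range(num_substreams)]
    matched: List[Set[int]] = [set() for _ in range(num_substreams)]
    kept: List[List[Edge]] = [[] for _ in range(num_substreams)]

    # Single pass over the stream. Each substream touches only its own
    # state, so the inner loop has no cross-substream dependencies and
    # could run in parallel.
    for u, v, w in edges:
        if u == v:
            continue  # ignore self-loops
        for i, threshold in enumerate(thresholds):
            if w >= threshold and u not in matched[i] and v not in matched[i]:
                matched[i].update((u, v))
                kept[i].append((u, v, w))

    # Merge step: prefer edges from substreams with higher thresholds.
    used: Set[int] = set()
    result: List[Edge] = []
    for i in reversed(range(num_substreams)):
        for u, v, w in kept[i]:
            if u not in used and v not in used:
                used.update((u, v))
                result.append((u, v, w))
    return result


if __name__ == "__main__":
    stream = [(0, 1, 1.0), (1, 2, 5.0), (2, 3, 1.0)]
    print(substream_matching(stream, num_substreams=3, eps=1.0))
```

In this sketch the per-substream state is just a set of matched vertices, which hints at why such a scheme might map well onto independent hardware pipelines; whether the paper's design takes this form is not stated in the abstract.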

Updated: 2020-05-04