Scalable Light-Weight Integration of FPGA Based Accelerators with Chip Multi-Processors
arXiv - CS - Hardware Architecture Pub Date : 2020-09-03 , DOI: arxiv-2009.01441
Zhe Lin, Sharad Sinha, Hao Liang, Liang Feng, Wei Zhang

Modern multicore systems are migrating from homogeneous to heterogeneous designs with accelerator-based computing in order to overcome the performance and power walls. Within this trend, FPGA-based accelerators are becoming increasingly attractive due to their flexibility and low design cost. In this paper, we propose architectural support for efficient interfacing between FPGA-based multi-accelerators and chip multiprocessors (CMPs) connected through a network-on-chip (NoC). Distributed packet receivers and hierarchical packet senders are designed to maintain scalability and reduce the critical path delay under a heavy task load. A dedicated accelerator chaining mechanism is also proposed to facilitate intra-FPGA data reuse among accelerators, avoiding the prohibitive communication overhead between the FPGA and the processors. To evaluate the proposed architecture, a complete system emulation with programmability support is performed using FPGA prototyping. Experimental results demonstrate that the proposed architecture is high-performance, light-weight, and scalable.

Updated: 2020-09-04