Exploiting RapidWright in the Automatic Generation of Application-Specific FPGA Overlays
arXiv - CS - Hardware Architecture, Pub Date: 2020-01-31, DOI: arxiv-2001.11886
Joel Mandebi Mbongue, Danielle Tchuinkou Kwadjo and Christophe Bobda

Overlay architectures implemented on FPGA devices have been proposed as a means to increase FPGA adoption in general-purpose computing. They provide software-like benefits such as flexibility and programmability, which makes it easier to build dedicated compilers. However, existing overlays are generic and resource- and power-hungry, with performance usually an order of magnitude lower than bare-metal implementations. As a result, FPGA overlays have been confined to research and some niche applications. In this paper, we introduce Application-Specific FPGA Overlays (AS-Overlays), which can provide bare-metal performance to FPGA overlays, thus opening the door to broader adoption. Our approach is based on the automatic extraction of hardware kernels from data flow applications. The extracted kernels are then used to generate application-specific hardware accelerators. The overlay is reconfigured with RapidWright, which makes it possible to bypass the HDL design flow. Through prototyping, we demonstrate the viability and relevance of our approach. Experiments show a productivity improvement of up to 20x compared to state-of-the-art FPGA overlays, while achieving over 1.33x higher Fmax than direct FPGA implementation, with the possibility of lower resource and power consumption compared to bare metal.

Updated: 2020-02-10