Performance Modeling for CNN Inference Accelerators on FPGA
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IF 2.9), Pub Date: 2020-04-01, DOI: 10.1109/tcad.2019.2897634
Yufei Ma, Yu Cao, Sarma Vrudhula, Jae-Sun Seo

The recently reported successes of convolutional neural networks (CNNs) in many areas have generated wide interest in the development of field-programmable gate array (FPGA)-based accelerators. To achieve high performance and energy efficiency, an FPGA-based accelerator must fully utilize the limited computation resources and minimize data communication and memory access, both of which are impacted and constrained by a variety of design parameters, e.g., the degree and dimension of parallelism, the size of on-chip buffers, and the bandwidth of the external memory. The large design space of the accelerator makes it impractical to search for the optimal design during the implementation phase. To address this problem, a performance model is described that estimates the performance and resource utilization of an FPGA implementation. By this means, performance bottlenecks and design bounds can be identified, and the optimal design options can be explored early in the design phase. The proposed performance model is validated on a variety of CNN algorithms by comparing its estimates with on-board test results on two different FPGAs.
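To illustrate the kind of analytical model the abstract describes, the sketch below estimates per-layer latency as the maximum of compute cycles and external-memory transfer cycles, then brute-forces a small design space of parallelism factors under a DSP budget. The unrolling scheme (`pof` output channels and `poy` output rows in parallel), the one-DSP-per-MAC-lane cost, and the double-buffering overlap assumption are illustrative choices for this sketch, not the paper's actual formulas.

```python
import math
from itertools import product

def conv_compute_cycles(nox, noy, nof, nif, k, pof, poy):
    """Cycles to compute one conv layer when pof output channels and
    poy output rows are processed in parallel (illustrative unrolling
    scheme, not the paper's exact formulation)."""
    return (math.ceil(nof / pof) * math.ceil(noy / poy)
            * nox * nif * k * k)

def conv_memory_cycles(nix, niy, nif, nof, k, bytes_per_word, bw_bytes_per_cycle):
    """Cycles to move weights plus input activations over the
    external-memory interface (output write-back omitted here)."""
    traffic = (nof * nif * k * k + nif * nix * niy) * bytes_per_word
    return math.ceil(traffic / bw_bytes_per_cycle)

def layer_latency(compute_cycles, memory_cycles):
    """With double-buffered on-chip memory, computation and data
    transfer overlap, so the slower of the two dominates."""
    return max(compute_cycles, memory_cycles)

def explore(nox, noy, nof, nif, k, dsp_budget,
            candidates=(1, 2, 4, 8, 16, 32, 64)):
    """Brute-force search over (pof, poy) under a DSP budget,
    assuming one DSP per parallel MAC lane. Returns the lowest
    compute-cycle count and the factors that achieve it."""
    best = None
    for pof, poy in product(candidates, repeat=2):
        if pof * poy > dsp_budget:
            continue  # design point exceeds the resource budget
        cyc = conv_compute_cycles(nox, noy, nof, nif, k, pof, poy)
        if best is None or cyc < best[0]:
            best = (cyc, pof, poy)
    return best
```

Because the model is a handful of closed-form expressions, sweeping thousands of design points takes milliseconds, which is what makes this style of early design-space exploration practical compared with synthesizing each candidate.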
