FPGA-based Implementation of a Real-Time Object Recognition System using Convolutional Neural Network
IEEE Transactions on Circuits and Systems II: Express Briefs (IF 4.4), Pub Date: 2020-04-01, DOI: 10.1109/tcsii.2019.2922372
Ali Azarmi Gilan, Mohammad Emad, Bijan Alizadeh

High computational complexity and power consumption make convolutional neural networks (CNNs) unsuitable for real-time embedded applications. In this brief, we introduce a low-power, flexible platform as a hardware accelerator for CNNs. The proposed architecture is fully configurable through a software library, so the same reconfigurable hardware can run different CNN models. The hardware accelerator is evaluated on a ZC706 evaluation board. We use the AlexNet architecture in a real-time object recognition application to demonstrate the effectiveness of the proposed CNN accelerator. The results show that throughputs of 198.1 GOP/s using 512 DSP blocks and 23.14 GOP/s using 64 DSP blocks are achievable for the convolution and fully connected layers, respectively. Moreover, images are processed at 82 frames/s, which is significantly higher than the rates of existing implementations.
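To put the reported figures on a common scale, the short sketch below derives two quantities from them: throughput per DSP block for each layer type, and the average time available per frame at the stated frame rate. It uses only the numbers quoted in the abstract. The function and variable names are illustrative, and the per-frame figure is simply the inverse of the frame rate, which need not equal the end-to-end latency of a pipelined design.

```python
# Figures quoted in the abstract.
CONV_GOPS = 198.1        # convolution layers, GOP/s
CONV_DSP_BLOCKS = 512
FC_GOPS = 23.14          # fully connected layers, GOP/s
FC_DSP_BLOCKS = 64
FRAMES_PER_SECOND = 82


def gops_per_dsp(gops: float, dsp_blocks: int) -> float:
    """Throughput normalized by the number of DSP blocks used."""
    return gops / dsp_blocks


def frame_interval_ms(fps: float) -> float:
    """Average time per frame in milliseconds (inverse of the frame rate)."""
    return 1000.0 / fps


conv_per_dsp = gops_per_dsp(CONV_GOPS, CONV_DSP_BLOCKS)
fc_per_dsp = gops_per_dsp(FC_GOPS, FC_DSP_BLOCKS)
interval_ms = frame_interval_ms(FRAMES_PER_SECOND)

print(f"Convolution:     {conv_per_dsp:.3f} GOP/s per DSP block")
print(f"Fully connected: {fc_per_dsp:.3f} GOP/s per DSP block")
print(f"Average frame interval: {interval_ms:.2f} ms")
```

Normalized this way, the two layer types come out at similar per-DSP throughput (roughly 0.39 and 0.36 GOP/s per block), and 82 frames/s corresponds to about 12.2 ms per frame on average.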

Updated: 2020-04-01