A novel hardware-oriented ultra-high-speed object detection algorithm based on convolutional neural network
Journal of Real-Time Image Processing (IF 2.9) Pub Date: 2019-12-21, DOI: 10.1007/s11554-019-00931-5
Jianquan Li , Xianlei Long , Shenhua Hu , Yiming Hu , Qingyi Gu , De Xu

This paper describes a hardware-oriented two-stage algorithm that can be deployed on a resource-limited field-programmable gate array (FPGA) for fast object detection and recognition without external memory. The first stage is bounding-box proposal using a conventional object detection method, and the second is convolutional neural network (CNN)-based classification for improved accuracy. Frequent access to external memory significantly degrades the execution efficiency of object classification. Unfortunately, existing CNN models with large numbers of parameters are difficult to deploy on FPGAs with limited on-chip memory resources. In this study, we designed a compact CNN model and performed hardware-oriented quantization of its parameters and intermediate results. As a result, CNN-based ultra-fast object classification was realized with all parameters and intermediate results stored on chip. Several evaluations were performed to demonstrate the performance of the proposed algorithm. The object classification module consumes only 163.67 Kbits of on-chip memory for ten regions of interest (ROIs), which makes it suitable for low-end FPGA devices. In terms of accuracy, our method achieves a correct classification rate of 98.01% on the open-source MNIST data set and over 96.5% on three other self-built data sets, distinctly better than conventional ultra-high-speed object detection algorithms.
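The abstract does not specify the bit widths or quantization scheme used in the paper; as an illustration of the general idea behind hardware-oriented quantization (mapping floating-point CNN weights to narrow fixed-point values so they fit in on-chip memory), the following minimal sketch quantizes a weight tensor to signed 8-bit fixed-point with an assumed 6 fractional bits. The function name, bit widths, and layer shape are illustrative assumptions, not details from the paper.

```python
import numpy as np

def quantize_fixed_point(weights, bits=8, frac_bits=6):
    """Map float weights to signed fixed-point integers (illustrative,
    not the paper's actual scheme). Returns the integer codes and the
    dequantized values for measuring quantization error."""
    scale = 1 << frac_bits                    # step size = 1 / scale
    qmin = -(1 << (bits - 1))                 # e.g. -128 for 8 bits
    qmax = (1 << (bits - 1)) - 1              # e.g. +127 for 8 bits
    q = np.clip(np.round(weights * scale), qmin, qmax).astype(np.int32)
    return q, q.astype(np.float64) / scale

# Hypothetical small conv layer: 16 filters of 3x3 weights.
rng = np.random.default_rng(0)
w = rng.uniform(-0.5, 0.5, size=(16, 3, 3))
q, w_hat = quantize_fixed_point(w)

# Memory footprint if each weight is stored as 8 bits on chip.
mem_kbits = q.size * 8 / 1024
print(f"on-chip storage for this layer: {mem_kbits:.3f} Kbits")
print(f"max quantization error: {np.abs(w - w_hat).max():.5f}")
```

Storing 8-bit codes instead of 32-bit floats cuts this layer's footprint fourfold, which is the kind of saving that lets all parameters and intermediate results stay in on-chip block RAM.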

Updated: 2019-12-21