Hybrid Fixed Point/Binary Deep Neural Network Design Methodology for Low Power Object Detection
IEEE Journal on Emerging and Selected Topics in Circuits and Systems (IF 4.6), Pub Date: 2020-09-01, DOI: 10.1109/jetcas.2020.3015753
Jiun-In Guo, Chia-Chi Tsai, Jian-Lin Zeng, Shao-Wei Peng, En-Chih Chang

High computational complexity and high memory bandwidth are the major challenges in realizing low-power deep neural networks for real-time applications. Binarizing the feature maps and filter coefficients of a deep neural network is an efficient way to reduce the power consumption of deep learning object detection; however, reducing a 32-bit floating-point word to a single binary bit greatly sacrifices detection accuracy. This paper proposes a hybrid fixed-point/binary deep neural network design methodology for object detection that achieves low power consumption by taking advantage of both fixed-point and binary deep neural networks, allocating sufficient bit-width to the hardware datapath in each layer of the network. The proposed methodology combines dynamic fixed-point quantization and binarization to aggressively compress the object detection model into a compact hybrid fixed-point/binary detection network with lower bandwidth and lower computational complexity. An automation tool based on the proposed methodology is also developed to train a hybrid deep neural network within a specified quality-loss range. Taking MobileNet-SSD as an example, the resulting model achieves a 91% model size reduction and a 75.8% memory bandwidth reduction at the cost of less than 1% mAP degradation. The proposed design methodology for hybrid fixed-point/binary deep neural networks achieves a good balance among detection accuracy, model size compression ratio, and feature map reduction for low-power deep learning object detection applications.
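
The abstract does not spell out the quantization rules, so the following is only a minimal sketch of the two general techniques it names, not the authors' implementation. It assumes a per-tensor fractional length for dynamic fixed point and a sign-plus-mean-magnitude scheme for binarization; all function names, the layer sizes, and the bit allocation are hypothetical and chosen for illustration.

```python
import math


def quantize_dynamic_fixed_point(values, bit_width=8):
    """Quantize floats to signed fixed point, choosing the fractional
    length from the tensor's largest magnitude (assumed per-tensor scheme)."""
    max_abs = max(abs(v) for v in values)
    # Integer bits needed to cover the largest magnitude (sign bit is separate).
    int_bits = math.floor(math.log2(max_abs)) + 1 if max_abs > 0 else 0
    frac_bits = bit_width - 1 - int_bits
    scale = 2.0 ** frac_bits
    q_min, q_max = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    # Round to the nearest code and clip to the representable range.
    codes = [min(q_max, max(q_min, round(v * scale))) for v in values]
    return [c / scale for c in codes], frac_bits


def binarize(values):
    """Reduce each value to a sign in {-1, +1} plus one shared scale,
    here the mean absolute value (an assumed choice of scale)."""
    alpha = sum(abs(v) for v in values) / len(values)
    signs = [1 if v >= 0 else -1 for v in values]
    return signs, alpha


def hybrid_size_reduction(layer_params, layer_bits, baseline_bits=32):
    """Fraction of model size saved when layer i stores layer_params[i]
    weights at layer_bits[i] bits instead of baseline_bits each."""
    hybrid = sum(n * b for n, b in zip(layer_params, layer_bits))
    baseline = sum(layer_params) * baseline_bits
    return 1.0 - hybrid / baseline


if __name__ == "__main__":
    dequantized, frac_bits = quantize_dynamic_fixed_point([0.5, -1.25, 3.0])
    print(dequantized, frac_bits)
    print(binarize([0.5, -1.5, 1.0]))
    # Hypothetical two-layer model: one 8-bit layer, one binary layer.
    print(hybrid_size_reduction([1000, 1000], [8, 1]))
```

The last function illustrates why a hybrid allocation can compress a model heavily: each binary layer costs 1 bit per weight rather than 32, while layers kept at a wider fixed-point width preserve precision where accuracy is more sensitive. The layer sizes used here are made up and are not the MobileNet-SSD breakdown from the paper.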

Updated: 2020-09-01