Journal of Real-Time Image Processing (IF 2.9), Pub Date: 2021-06-19, DOI: 10.1007/s11554-021-01145-4
Jin Han, Yonghao Yang
Object detection algorithms based on deep learning have made continuous progress in recent years. Reducing model complexity and improving detection speed without sacrificing accuracy remain central goals of current object detection research. This paper presents L-Net, a lightweight object detection model whose backbone is based on the ShuffleNetV2 network structure. A suitable backbone network is obtained by replacing the 3 × 3 depthwise convolutions with 5 × 5 depthwise convolutions and reducing the number of input channels. To obtain a more discriminative image feature description, a Pyramid Pooling Module and an Attention Pyramid Module are added after the backbone network. Experimental results show that L-Net uses only 1.54B FLOPs (floating point operations) to achieve 70.2% mAP (mean average precision) on PASCAL VOC2007 and 21.8% mAP on the MS COCO dataset. The model achieves competitive accuracy and speed while remaining lightweight.
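The trade-off behind the backbone change can be illustrated with simple arithmetic. The sketch below is not the authors' code: the layer shape (116 channels on a 40 × 40 feature map) and the function names are hypothetical, chosen only for illustration. It counts weights and multiply-accumulate operations for a bias-free depthwise convolution at both kernel sizes and compares them with a 1 × 1 pointwise convolution of the same width.

```python
def depthwise_conv_cost(channels, kernel_size, out_height, out_width):
    """Return (weights, multiply_accumulates) for a bias-free depthwise conv.

    A depthwise convolution applies one kernel_size x kernel_size filter
    per channel, so the cost grows linearly with the channel count.
    """
    weights = channels * kernel_size * kernel_size
    macs = weights * out_height * out_width
    return weights, macs


def pointwise_conv_cost(in_channels, out_channels, out_height, out_width):
    """Return (weights, multiply_accumulates) for a bias-free 1x1 conv."""
    weights = in_channels * out_channels
    macs = weights * out_height * out_width
    return weights, macs


# Hypothetical layer shape for illustration only; not taken from the paper.
channels, height, width = 116, 40, 40

w3, m3 = depthwise_conv_cost(channels, 3, height, width)
w5, m5 = depthwise_conv_cost(channels, 5, height, width)
pw, pm = pointwise_conv_cost(channels, channels, height, width)

print(f"3x3 depthwise: {w3} weights, {m3} MACs")
print(f"5x5 depthwise: {w5} weights, {m5} MACs")
print(f"1x1 pointwise: {pw} weights, {pm} MACs")
print(f"5x5 / 3x3 cost ratio: {m5 / m3:.2f}")
```

Under these assumed shapes, the 5 × 5 depthwise layer costs 25/9 times as much as the 3 × 3 one, yet it is still far cheaper than the pointwise layer, whose cost grows with the square of the channel count. That is one plausible reading of why a larger depthwise kernel can be paired with fewer channels in a lightweight design. The abstract itself does not give this breakdown.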
L-Net: lightweight and fast object detector-based ShuffleNetV2