A fast and effective video vehicle detection method leveraging feature fusion and proposal temporal link
Journal of Real-Time Image Processing ( IF 2.9 ) Pub Date : 2021-05-18 , DOI: 10.1007/s11554-021-01121-y
Yanni Yang , Huansheng Song , Shijie Sun , Wentao Zhang , Yan Chen , Lionel Rakal , Yong Fang

Vehicle detection in videos is a valuable but challenging technology in traffic monitoring. Because it runs in real time, the Single Shot MultiBox Detector (SSD) is often used to detect vehicles in images. However, the accuracy degradation of SSD is one of the significant problems in video vehicle detection. To address this problem in real time, this paper enhances detection performance by improving the SSD and exploiting the relationship between inter-frame detections. We propose a feature-fused SSD detector and a Tracking-guided Detections Optimizing (TDO) strategy for fast and effective video vehicle detection. We add a lightweight feature fusion sub-network to the standard SSD network, which aggregates the deeper-layer features into the shallower-layer features to enhance the semantic information of the shallower-layer features. At the post-processing stage of the feature-fused SSD, non-maximum suppression (NMS) is replaced by the TDO strategy, which links vehicles across frames with a fast tracking algorithm. Missed detections can thus be compensated by the propagated results, and the confidence of the final results can be optimized in the temporal domain. Our approach significantly improves the temporal consistency of the detection results at lower computational complexity. We evaluate the proposed method on two datasets. The experiments on our labeled highway dataset show that the mean average precision (mAP) of our method is 8.2% higher than that of the base detector. The runtime of our feature-fused SSD is 27.1 frames per second (fps), which is suitable for real-time detection. The experiments on the ImageNet VID dataset show that the proposed method is also comparable with state-of-the-art detectors.
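
The abstract does not specify how TDO links detections or rescores them, so the following Python sketch is only an illustration of the general idea of inter-frame linking, not the authors' method. It assumes a greedy IoU match in place of the paper's "fast tracking algorithm", averages the scores of linked detections, and propagates unmatched previous detections with a decayed score. The names `Detection`, `iou`, `link_and_refine`, and the parameters `iou_thresh` and `decay` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    box: Box
    score: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_and_refine(prev: List[Detection], curr: List[Detection],
                    iou_thresh: float = 0.5,
                    decay: float = 0.9) -> List[Detection]:
    """Link previous-frame detections to current ones and refine the result.

    Assumption (not from the paper): linked pairs take the mean of the two
    scores; unmatched previous detections are carried forward with a decayed
    score so that a missed vehicle still appears in the output.
    """
    refined: List[Detection] = []
    used = set()
    for p in prev:
        # Greedy match: the unused current detection with the highest IoU.
        best_j, best_iou = -1, iou_thresh
        for j, c in enumerate(curr):
            if j in used:
                continue
            overlap = iou(p.box, c.box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            used.add(best_j)
            c = curr[best_j]
            refined.append(Detection(c.box, (p.score + c.score) / 2))
        else:
            # No link found: reuse the previous box unchanged (no motion model).
            refined.append(Detection(p.box, p.score * decay))
    # Current detections without a link are kept as-is (newly appeared vehicles).
    refined.extend(c for j, c in enumerate(curr) if j not in used)
    return refined
```

Called once per frame with the refined output of the previous frame as `prev`, this keeps a vehicle in the results for a few frames after the detector loses it, with its score shrinking by `decay` each time. A real tracker would also predict box motion and drop tracks whose score falls too low.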




Updated: 2021-05-18