Deep Neural Network Based Vehicle and Pedestrian Detection for Autonomous Driving: A Survey
IEEE Transactions on Intelligent Transportation Systems (IF 8.5), Pub Date: 2021-05-25, DOI: 10.1109/tits.2020.2993926
Long Chen, Shaobo Lin, Xiankai Lu, Dongpu Cao, Hangbin Wu, Chi Guo, Chun Liu, Fei-Yue Wang

Vehicle and pedestrian detection is one of the critical tasks in autonomous driving. Because heterogeneous techniques have been proposed, selecting a detection system with an appropriate balance among detection accuracy, speed, and memory consumption for a specific task has become very challenging. To address this issue and to provide guidance for model selection, this paper analyzes several mainstream object detection architectures, including Faster R-CNN, R-FCN, and SSD, along with several typical feature extractors, such as ResNet50, ResNet101, MobileNet_V1, MobileNet_V2, Inception_V2, and Inception_ResNet_V2. Through extensive experiments on the KITTI benchmark, a commonly used street dataset, we demonstrate that Faster R-CNN ResNet50 obtains the best average precision (AP) (58%) for vehicle and pedestrian detection, with a speed of 8.6 FPS. Faster R-CNN Inception_V2 performs best for detecting cars and for detecting pedestrians (74.5% and 47.3%, respectively). ResNet101 consumes the most memory (9907 MB) and has the largest number of parameters (64.42 million), and Inception_ResNet_V2 is the slowest model (3.05 FPS). SSD MobileNet_V2 is the fastest model (70 FPS), and SSD MobileNet_V1 is the lightest model in terms of memory usage (875 MB); both are suitable for applications on mobile and embedded devices.
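
The trade-off described in the abstract can be illustrated with a small constraint filter. The following is only a sketch, not the authors' selection procedure: the `Reported` record, the `within` helper, and the `select` function are hypothetical names introduced here, and the table holds only figures quoted in the abstract, with `None` wherever the abstract gives no value. A model passes a constraint only when the relevant figure is actually reported.

```python
from typing import List, NamedTuple, Optional


class Reported(NamedTuple):
    """Figures quoted in the abstract; None means the abstract gives no value."""
    name: str
    fps: Optional[float] = None
    memory_mb: Optional[float] = None
    ap_percent: Optional[float] = None


# Names are used exactly as the abstract states them (hypothetical table layout).
REPORTED = [
    Reported("Faster R-CNN ResNet50", fps=8.6, ap_percent=58.0),
    Reported("ResNet101", memory_mb=9907.0),
    Reported("Inception_ResNet_V2", fps=3.05),
    Reported("SSD MobileNet_V2", fps=70.0),
    Reported("SSD MobileNet_V1", memory_mb=875.0),
]


def within(value: Optional[float],
           low: Optional[float] = None,
           high: Optional[float] = None) -> bool:
    """Check a bound; an unreported value fails any bound that was requested."""
    if low is None and high is None:
        return True   # no constraint requested on this figure
    if value is None:
        return False  # constraint requested, but the figure is not reported
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def select(models: List[Reported],
           min_fps: Optional[float] = None,
           max_memory_mb: Optional[float] = None,
           min_ap: Optional[float] = None) -> List[str]:
    """Return names of models whose reported figures satisfy every constraint."""
    return [
        m.name for m in models
        if within(m.fps, low=min_fps)
        and within(m.memory_mb, high=max_memory_mb)
        and within(m.ap_percent, low=min_ap)
    ]


if __name__ == "__main__":
    print(select(REPORTED, min_fps=30.0))         # real-time constraint
    print(select(REPORTED, max_memory_mb=1000.0)) # tight memory budget
```

Treating unreported figures as failing a requested constraint is a conservative design choice for this sketch; a complete comparison would need the per-model tables from the paper itself.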

Updated: 2021-06-01