Detecting Soccer Balls with Reduced Neural Networks
Journal of Intelligent & Robotic Systems (IF 3.1), Pub Date: 2021-02-27, DOI: 10.1007/s10846-021-01336-y
Douglas De Rizzo Meneghetti, Thiago Pedro Donadon Homem, Jonas Henrique Renolfi de Oliveira, Isaac Jesus da Silva, Danilo Hernani Perico, Reinaldo Augusto da Costa Bianchi

Object detection techniques that achieve state-of-the-art detection accuracy employ convolutional neural networks, typically implemented to run with lower latency on graphics processing units. Some systems, such as mobile robots, operate under hardware constraints but still benefit from object detection capabilities. Multiple network models have been proposed that achieve comparable accuracy with reduced architectures and leaner operations. Motivated by the need to create a near real-time object detection system for a soccer team of mobile robots operating with x86 CPU-only embedded computers, this work analyses the average precision and inference time of multiple object detection systems in a constrained hardware setting. We train open implementations of MobileNetV2 and MobileNetV3 models with different underlying architectures, obtained by changing their input and width multipliers, as well as YOLOv3, TinyYOLOv3, YOLOv4 and TinyYOLOv4, on an annotated image dataset captured using a mobile robot. We emphasize the speed/accuracy trade-off in the models by reporting their average precision on a test dataset and their inference time on videos at different resolutions, under constrained and unconstrained hardware configurations. Results show that MobileNetV3 models offer a good trade-off between average precision and inference time in constrained scenarios only, while MobileNetV2 models with high width multipliers are appropriate for server-side inference. YOLO models in their official implementations are not suitable for inference on CPUs.
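
To illustrate what changing a width multiplier does to a network, the following is a minimal Python sketch, not the authors' code. It assumes the rounding convention found in public MobileNet reference implementations, where each scaled channel count is rounded to a multiple of 8 and never reduced by more than 10%. The names `make_divisible`, `scale_channels` and the `BASE_CHANNELS` list are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def make_divisible(value: float, divisor: int = 8,
                   min_value: Optional[int] = None) -> int:
    """Round `value` to a multiple of `divisor`.

    Assumed convention: the result is at least `min_value` and is bumped up
    by one `divisor` step if rounding would remove more than 10% of `value`.
    """
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def scale_channels(base_channels: List[int], width_multiplier: float) -> List[int]:
    """Apply a width multiplier to every layer's channel count."""
    return [make_divisible(c * width_multiplier) for c in base_channels]


# Illustrative per-stage channel counts (an assumption, not taken from the paper).
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]

if __name__ == "__main__":
    for alpha in (0.35, 0.5, 1.0, 1.4):
        print(alpha, scale_channels(BASE_CHANNELS, alpha))
```

A smaller multiplier thins every layer, which reduces the number of multiply-accumulate operations and therefore CPU inference time, at some cost in accuracy. The input multiplier plays the analogous role for image resolution.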




Updated: 2021-02-28