Object detection from UAV thermal infrared images and videos using YOLO models
International Journal of Applied Earth Observation and Geoinformation (IF 7.6), Pub Date: 2022-07-18, DOI: 10.1016/j.jag.2022.102912
Chenchen Jiang, Huazhong Ren, Xin Ye, Jinshun Zhu, Hui Zeng, Yang Nan, Min Sun, Xiang Ren, Hongtao Huo

Object detection, which identifies instances of specific object categories in images, is one of the most crucial tasks in computer vision and remote sensing. Multi-scenario thermal infrared (TIR) images and videos acquired by unmanned aerial vehicles (UAVs) are two important data sources in public security. However, detecting objects in them remains challenging because of complicated scene information, coarser resolution than visible-light video, and a lack of public labelled datasets and trained models. This study proposes a UAV TIR object detection framework for images and videos. You Only Look Once (YOLO) models, which are based on a convolutional neural network (CNN) architecture, were designed to extract features from ground-based TIR images and videos captured by Forward-Looking Infrared (FLIR) cameras. The most effective algorithm was identified using evaluation metrics and then applied to detect objects in TIR videos from UAVs. The highest mean average precision (mAP) for the person and car instances was 88.69% in the validation task. The fastest detection speed reached 50 frames per second (FPS), and the smallest model size was observed in YOLOv5-s. In the application, the cross-detection performance of a YOLOv5-s model on persons and cars in UAV TIR videos is discussed with respect to the UAVs' different observation angles, and the effectiveness of the YOLO architecture is demonstrated. This study supports the qualitative and quantitative evaluation of object detection from TIR images and videos using deep-learning models.
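
The abstract reports detection quality as mean average precision (mAP) over the person and car classes. As a rough illustration of how such a number can be computed, the following is a minimal sketch, assuming axis-aligned boxes, an IoU threshold of 0.5, and all-point interpolation of the precision-recall curve. The abstract does not state the evaluation protocol used in the paper, so these choices and all function names (`iou`, `average_precision`, `mean_average_precision`) are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence score, box)
    truths: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the paper
) -> float:
    """AP for one class, here on a single image for simplicity."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        overlaps = [iou(box, t) for t in truths]
        best = max(range(len(truths)), key=lambda j: overlaps[j])
        # True positive only if it overlaps enough with an unmatched truth.
        is_tp = overlaps[best] >= iou_thr and not matched[best]
        if is_tp:
            matched[best] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection.
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))

    # Make precision non-increasing in recall (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)
```

In a full evaluation, detections and ground-truth boxes would be pooled over every image in the validation set for each class before the precision-recall curve is built; the single-image version above only keeps the sketch short.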




Updated: 2022-07-19