GEM: Glare or Gloom, I Can Still See You -- End-to-End Multimodal Object Detector
arXiv - CS - Robotics Pub Date : 2021-02-24 , DOI: arxiv-2102.12319 Osama Mazhar, Jens Kober, Robert Babuska
Deep neural networks designed for vision tasks are often prone to failure
when they encounter environmental conditions not covered by the training data.
Efficient fusion strategies for multi-sensor configurations can enhance the
robustness of detection algorithms by exploiting the redundancy across
sensor streams. In this paper, we propose sensor-aware multi-modal fusion
strategies for 2D object detection in harsh lighting conditions. Our network
learns to estimate the measurement reliability of each sensor modality in the
form of scalar weights and masks, without prior knowledge of the sensor
characteristics. The obtained weights are assigned to the extracted feature
maps, which are subsequently fused and passed to a transformer encoder-decoder
network for object detection. This is critical in the case of asymmetric sensor
failures, where a single degraded modality could otherwise lead to severe
consequences. Through extensive experimentation, we show that the proposed
strategies outperform existing state-of-the-art methods on the FLIR-Thermal
dataset, improving the mAP by up to 25.2%. We also propose a new "r-blended"
hybrid depth modality for RGB-D multi-modal detection tasks. Our proposed
method also obtains promising results on the SUNRGB-D dataset.
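The fusion scheme described in the abstract (per-modality scalar reliability weights applied to extracted feature maps, which are then combined and handed to a transformer detector) can be sketched as follows. This is an illustrative NumPy mock-up under assumed shapes, not the authors' implementation: the gating function `estimate_reliability` here is a simple global-average-pooling stand-in for the learned sub-network the paper describes.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def estimate_reliability(feat):
    # Hypothetical gate: collapse a feature map (C, H, W) to one
    # scalar logit via global average pooling. In the paper this
    # role is played by a learned reliability-estimation network.
    return feat.mean()

def fuse_modalities(features):
    """Weight each modality's feature map by its estimated
    reliability and sum them into a single fused map that a
    transformer encoder-decoder detector would then consume."""
    logits = np.array([estimate_reliability(f) for f in features])
    weights = softmax(logits)  # one scalar weight per modality
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights

# Toy RGB and thermal feature maps with assumed shape (C, H, W).
rng = np.random.default_rng(0)
rgb = rng.random((8, 16, 16))
thermal = rng.random((8, 16, 16))
fused, w = fuse_modalities([rgb, thermal])
```

Because the weights are normalized with a softmax, a modality whose features collapse under glare or darkness receives a small weight, so the fused map is dominated by the healthier sensor stream — the asymmetric-failure behavior the abstract highlights.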
Updated: 2021-02-25