GEM: Glare or Gloom, I Can Still See You -- End-to-End Multimodal Object Detector
arXiv - CS - Robotics Pub Date : 2021-02-24 , DOI: arxiv-2102.12319 Osama Mazhar, Jens Kober, Robert Babuska
Deep neural networks designed for vision tasks are often prone to failure
when they encounter environmental conditions not covered by the training data.
Efficient fusion strategies for multi-sensor configurations can enhance the
robustness of detection algorithms by exploiting the redundancy across
sensor streams. In this paper, we propose sensor-aware multi-modal fusion
strategies for 2D object detection in harsh lighting conditions. Our network
learns to estimate the measurement reliability of each sensor modality in the
form of scalar weights and masks, without prior knowledge of the sensor
characteristics. The obtained weights are assigned to the extracted feature
maps, which are subsequently fused and passed to a transformer encoder-decoder
network for object detection. This is critical in the case of asymmetric sensor
failures, where a single degraded modality could otherwise lead to severe
consequences. Through extensive experimentation, we show that the proposed
strategies outperform existing state-of-the-art methods on the FLIR-Thermal
dataset, improving the mAP by up to 25.2%. We also propose a new "r-blended"
hybrid depth modality for RGB-D multi-modal detection tasks. Our proposed
method also obtains promising results on the SUNRGB-D dataset.
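The fusion scheme described in the abstract (per-modality scalar reliability weights applied to extracted feature maps, which are then combined and handed to a transformer detector) can be sketched as follows. This is an illustrative NumPy mock-up under assumed shapes, not the authors' implementation: the gating function `estimate_reliability` here is a simple global-average-pooling stand-in for the learned sub-network the paper describes.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def estimate_reliability(feat):
    # Hypothetical gate: collapse a feature map (C, H, W) to one
    # scalar logit via global average pooling. In the paper this
    # role is played by a learned reliability-estimation network.
    return feat.mean()

def fuse_modalities(features):
    """Weight each modality's feature map by its estimated
    reliability and sum them into a single fused map that a
    transformer encoder-decoder detector would then consume."""
    logits = np.array([estimate_reliability(f) for f in features])
    weights = softmax(logits)  # one scalar weight per modality
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights

# Toy RGB and thermal feature maps with assumed shape (C, H, W).
rng = np.random.default_rng(0)
rgb = rng.random((8, 16, 16))
thermal = rng.random((8, 16, 16))
fused, w = fuse_modalities([rgb, thermal])
```

Because the weights are normalized with a softmax, a modality whose features collapse under glare or darkness receives a small weight, so the fused map is dominated by the healthier sensor stream — the asymmetric-failure behavior the abstract highlights.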
Updated: 2021-02-25