Multi-modal Sensor Fusion-Based Deep Neural Network for End-to-end Autonomous Driving with Scene Understanding
arXiv - CS - Robotics, Pub Date: 2020-05-19, DOI: arxiv-2005.09202
Authors: Zhiyu Huang, Chen Lv, Yang Xing, Jingda Wu
This study aims to improve the performance and generalization capability of
end-to-end autonomous driving with scene understanding by leveraging deep
learning and multimodal sensor fusion. The proposed end-to-end deep neural
network takes the visual image and its associated depth information as input,
fused at an early level, and concurrently outputs a pixel-wise semantic
segmentation (as scene understanding) and vehicle control commands. The model
is tested in high-fidelity simulated urban driving conditions and compared
against the CoRL2017 and NoCrash benchmarks. The results show that the proposed
approach achieves better performance and generalization ability than prior
models: a 100% success rate in static navigation tasks in both training and
unobserved situations, and higher success rates in the other tasks. A further
ablation study shows that removing either multimodal sensor fusion or scene
understanding degrades the model in the new environment because of false
perception. These results indicate that the model's performance is improved by
the synergy of multimodal sensor fusion with the scene understanding subtask,
demonstrating the feasibility and effectiveness of the developed deep neural
network with multimodal sensor fusion.
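
To make the "early fusion" and "concurrent outputs" ideas in the abstract concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation: the function names `early_fuse` and `multitask_loss`, the nested-list image layout, and the weighted-sum loss with `seg_weight` are all assumptions made for illustration, since the abstract does not specify how the inputs are stacked or how the two objectives are combined.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[float, float, float]]]  # H x W pixels, 3 channels each
DepthMap = List[List[float]]                        # H x W depth values


def early_fuse(rgb: RGBImage, depth: DepthMap) -> List[List[Tuple[float, ...]]]:
    """Early fusion (assumed form): append depth as a 4th channel of each pixel.

    The fused H x W x 4 array would then feed a single shared encoder,
    instead of processing image and depth in separate branches.
    """
    same_height = len(rgb) == len(depth)
    same_width = all(len(r) == len(d) for r, d in zip(rgb, depth))
    if not (same_height and same_width):
        raise ValueError("RGB image and depth map must have the same height and width")
    return [
        [(*pixel, d) for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


def multitask_loss(seg_loss: float, control_loss: float, seg_weight: float = 1.0) -> float:
    """Hypothetical joint objective: control loss plus weighted segmentation loss.

    A weighted sum is one common way to train two output heads together;
    the weighting actually used in the paper is not given in the abstract.
    """
    return control_loss + seg_weight * seg_loss


if __name__ == "__main__":
    rgb = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)]]  # 1 x 2 image
    depth = [[1.5, 2.5]]                          # matching 1 x 2 depth map
    print(early_fuse(rgb, depth))
    print(multitask_loss(seg_loss=0.5, control_loss=0.2, seg_weight=2.0))
```

In this sketch the segmentation head and the control head would both consume features computed from the fused four-channel input, which is one way to read the "synergy" between sensor fusion and the scene understanding subtask described above.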
Updated: 2020-08-04