JD-SLAM: Joint camera pose estimation and moving object segmentation for simultaneous localization and mapping in dynamic scenes
International Journal of Advanced Robotic Systems (IF 2.3), Pub Date: 2021-02-24, DOI: 10.1177/1729881421994447
Yujia Zhai, Baoli Lu, Weijun Li, Jian Xu, Shuangyi Ma

As a fundamental assumption in simultaneous localization and mapping (SLAM), the static-scene hypothesis can hardly be fulfilled in indoor/outdoor navigation or localization applications. Recent work on SLAM in dynamic scenes commonly relies on heavy pixel-level segmentation networks to distinguish dynamic objects, which incurs enormous computation, limits the real-time performance of the system, and restricts its deployment on mobile terminals. In this article, we present a lightweight system for monocular SLAM in dynamic scenes that runs in real time on a central processing unit (CPU) and generates a semantic probability map. The pixel-wise semantic segmentation network is replaced with a lightweight object detection network combined with three-dimensional segmentation based on motion clustering, and a framework integrating an improved weighted random sample consensus (RANSAC) solver is proposed to jointly solve the camera pose and perform three-dimensional object segmentation, enabling both high accuracy and high efficiency. In addition, prior information from the generated map and the object detection results is introduced for better estimation. Experiments on a public data set and in the real world demonstrate that our method achieves an outstanding improvement in both accuracy and speed compared with state-of-the-art methods.
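The abstract describes a joint scheme: a lightweight detector supplies priors about likely-moving regions, and an improved weighted RANSAC solver uses those priors while estimating the camera pose, with the residual outliers feeding the motion-clustering segmentation. The snippet below is a minimal, self-contained sketch of that idea, not the authors' implementation: it down-weights keypoints that fall inside detected dynamic-object boxes, samples RANSAC minimal sets according to those weights, and scores essential-matrix hypotheses by a weighted inlier sum. The box format, weight value, Sampson-error threshold, and all function names are assumptions made for the example.

```python
import numpy as np


def detection_prior_weights(keypoints_px, dynamic_boxes, w_dynamic=0.1):
    """Prior weight per keypoint: low if the keypoint falls inside a detected
    dynamic-object bounding box (x1, y1, x2, y2), full weight otherwise.
    The value 0.1 is an illustrative assumption."""
    w = np.ones(len(keypoints_px))
    for i, (u, v) in enumerate(keypoints_px):
        if any(x1 <= u <= x2 and y1 <= v <= y2 for x1, y1, x2, y2 in dynamic_boxes):
            w[i] = w_dynamic
    return w


def eight_point_essential(x1, x2):
    """Essential matrix from 8 correspondences in normalized camera coordinates (Nx2)."""
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    U, _, Vt = np.linalg.svd(E)
    # Enforce the essential-matrix constraint: two equal singular values, one zero.
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt


def sampson_error(E, x1, x2):
    """Sampson distance of each correspondence to the epipolar constraint x2' E x1 = 0."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    Ex1 = h1 @ E.T   # rows are E @ x1_i
    Etx2 = h2 @ E    # rows are E' @ x2_i
    num = np.sum(h2 * Ex1, axis=1) ** 2
    den = Ex1[:, 0] ** 2 + Ex1[:, 1] ** 2 + Etx2[:, 0] ** 2 + Etx2[:, 1] ** 2
    return num / den


def weighted_ransac_pose(x1, x2, weights, iters=500, thresh=4e-6, seed=0):
    """Weighted RANSAC for the camera pose: minimal sets are drawn with probability
    proportional to the detection prior, and each hypothesis is scored by its weighted
    inlier sum. Returns the best essential matrix and the static (inlier) mask;
    outliers carrying a dynamic-object prior are candidates for motion clustering."""
    rng = np.random.default_rng(seed)
    p = weights / weights.sum()
    best_E, best_score, best_inliers = None, -np.inf, None
    for _ in range(iters):
        idx = rng.choice(len(x1), size=8, replace=False, p=p)
        E = eight_point_essential(x1[idx], x2[idx])
        err = sampson_error(E, x1, x2)
        inliers = err < thresh
        score = float(weights[inliers].sum())
        if score > best_score:
            best_E, best_score, best_inliers = E, score, inliers
    return best_E, best_inliers
```

Sampling the minimal sets by the detection prior keeps likely-dynamic keypoints out of the pose hypotheses without discarding them outright, which is the intuition behind coupling the detector output with the weighted solver rather than masking pixels with a heavy segmentation network.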




Updated: 2021-02-24