Abstract
Human-computer interaction requires accurate localization and effective mapping, but dynamic objects can degrade the accuracy of both. State-of-the-art SLAM algorithms typically assume that the environment is static. This paper proposes a new SLAM method that uses Mask R-CNN to detect dynamic objects in the environment and builds a map containing semantic information. In our method, the reprojection error, photometric error, and depth error are used to assign a robust weight to each keypoint. The dynamic points can thus be separated from the static points, and the dynamic keypoints are used to obtain a geometric segmentation of the dynamic objects. Each pixel is assigned a semantic label to rebuild a semantic map. Finally, the proposed method is tested on the TUM RGB-D dataset, and the experimental results show that it outperforms state-of-the-art SLAM algorithms in dynamic environments.
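The abstract does not give the weighting formula, so the following is only a minimal sketch of how three per-keypoint residuals might be combined into one robust weight and then thresholded into static and dynamic sets. The Cauchy-style weight, the noise scales (`SIGMA_*`), the threshold, and all names are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class KeypointResiduals:
    """Residuals of one keypoint between two frames (hypothetical container)."""
    reprojection: float  # pixels
    photometric: float   # intensity levels
    depth: float         # metres


# Assumed noise scales that make the three residuals comparable.
SIGMA_REPROJECTION = 1.0
SIGMA_PHOTOMETRIC = 10.0
SIGMA_DEPTH = 0.05


def robust_weight(r: KeypointResiduals) -> float:
    """Cauchy-style weight in (0, 1]; large residuals give small weights."""
    normalized_sq = (
        (r.reprojection / SIGMA_REPROJECTION) ** 2
        + (r.photometric / SIGMA_PHOTOMETRIC) ** 2
        + (r.depth / SIGMA_DEPTH) ** 2
    )
    return 1.0 / (1.0 + normalized_sq)


def split_static_dynamic(residuals, threshold=0.2):
    """Return (static_indices, dynamic_indices) using an assumed threshold."""
    static, dynamic = [], []
    for index, r in enumerate(residuals):
        if robust_weight(r) >= threshold:
            static.append(index)
        else:
            dynamic.append(index)
    return static, dynamic


if __name__ == "__main__":
    observations = [
        KeypointResiduals(0.5, 5.0, 0.01),  # small residuals on all three terms
        KeypointResiduals(4.0, 30.0, 0.3),  # large residuals on all three terms
    ]
    print(split_static_dynamic(observations))
```

In this sketch a keypoint whose residuals are all small keeps a weight close to 1 and is treated as static, while a keypoint that disagrees strongly with the estimated camera motion receives a weight near 0 and is treated as dynamic.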
Funding
This work is supported by the National Natural Science Foundation of China (Project No. 61773333) and the China Scholarship Council.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wen, S., Li, P., Zhao, Y. et al. Semantic visual SLAM in dynamic environment. Auton Robot 45, 493–504 (2021). https://doi.org/10.1007/s10514-021-09979-4