
A comprehensive overview of dynamic visual SLAM and deep learning: concepts, methods and challenges

  • Original Paper
  • Published in Machine Vision and Applications (2022)

Abstract

Visual SLAM (vSLAM) is a research topic that has developed rapidly in recent years, especially with the renewed interest in machine learning and, more particularly, deep-learning-based approaches. Current research focuses mainly on improving accuracy and robustness in complex and dynamic environments, and this very active topic has now reached a significant level of maturity. This paper presents a detailed yet accessible survey of vSLAM in the context of deep learning. It attempts to meet this challenge by organizing the literature, explaining the basic concepts and tools, and presenting current trends. The contributions of this study can be summarized in three essential points. The first is to present the state of the art incrementally, following the classical processing pipeline of vSLAM-based systems. The second is to give our short- and medium-term view of the development of this very active and evolving field. Finally, we share our opinions on the subject and its interactions with new trends, in particular the deep learning paradigm. We believe that this contribution provides both an overview and, more importantly, a critical and detailed vision that can serve as a roadmap for the field of vSLAM, in terms of models and concepts as well as associated technologies.
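
To make the "classical processing pipeline of vSLAM-based systems" mentioned above concrete, the sketch below outlines the typical stages of a feature-based vSLAM system: per-frame tracking, keyframe-based local mapping, and loop closing. It is a minimal illustrative sketch only; all class and function names are hypothetical placeholders and do not come from the paper or from any specific vSLAM library.

```python
# Minimal sketch of a classical feature-based vSLAM pipeline (illustrative only;
# all names are hypothetical placeholders, not code from the surveyed systems).

from dataclasses import dataclass, field


@dataclass
class SparseMap:
    keyframes: list = field(default_factory=list)   # selected frames with poses
    landmarks: list = field(default_factory=list)   # triangulated 3-D points


def track(frame, sparse_map):
    """Stage 1: estimate the camera pose of `frame` against the current map."""
    return {"frame": frame, "pose": "T_world_camera"}  # placeholder pose


def is_keyframe(tracked):
    """Heuristic: does this frame add enough new information to the map?"""
    return True  # placeholder decision


def local_mapping(sparse_map, tracked):
    """Stage 2: insert a keyframe, triangulate landmarks, refine the local map."""
    sparse_map.keyframes.append(tracked)


def close_loop(sparse_map, tracked):
    """Stage 3: place recognition against past keyframes; correct drift if a loop is found."""
    return False  # placeholder: no loop detected


def run_vslam(frames):
    sparse_map = SparseMap()
    trajectory = []
    for frame in frames:
        tracked = track(frame, sparse_map)
        trajectory.append(tracked["pose"])
        if is_keyframe(tracked):
            local_mapping(sparse_map, tracked)
            close_loop(sparse_map, tracked)
    return trajectory, sparse_map
```

The dynamic-scene variants discussed in the survey typically add a step before tracking that suppresses features on moving objects, for instance via semantic segmentation or optical flow.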

Author information

Corresponding author

Correspondence to Ayman Beghdadi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Beghdadi, A., Mallem, M. A comprehensive overview of dynamic visual SLAM and deep learning: concepts, methods and challenges. Machine Vision and Applications 33, 54 (2022). https://doi.org/10.1007/s00138-022-01306-w

Keywords

Navigation