
A review of monocular visual odometry

  • Survey
  • Published in The Visual Computer

Abstract

Monocular visual odometry provides more robust navigation and obstacle-avoidance capability for mobile robots than other forms of odometry, such as binocular visual odometry, RGB-D visual odometry and conventional odometry. This paper formulates the visual odometry problem and clarifies its relationship to visual simultaneous localization and mapping (SLAM). The basic principle of visual odometry is expressed mathematically: the pose change between consecutive frames is solved incrementally, and the resulting trajectory is further refined through global optimization. After analyzing the three main approaches to implementing visual odometry, the state-of-the-art monocular systems ORB-SLAM2, DSO and SVO are analyzed and compared in detail. The issues of robustness and real-time operation, which dominate current visual odometry research, are discussed in terms of future development directions and trends. Furthermore, we present a novel framework for next-generation visual odometry based on additional high-dimensional features, which have not yet been exploited in the relevant applications.
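The incremental principle summarized above, solving the relative pose between consecutive frames and compounding those estimates into a trajectory, can be sketched in a few lines. The following is a minimal hypothetical illustration, not code from any of the surveyed systems: it assumes the frame-to-frame relative poses (in practice recovered via the essential matrix, PnP, or direct image alignment) are already available as 4×4 homogeneous matrices, and all function names are ours.

```python
import numpy as np

def rotation_z(theta):
    """4x4 homogeneous transform: rotation by theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def translation(x, y, z):
    """4x4 homogeneous transform: pure translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def integrate_odometry(relative_poses):
    """Chain frame-to-frame pose estimates into absolute camera poses.

    relative_poses: 4x4 matrices T_k mapping frame-k coordinates into
    frame (k-1) coordinates (the per-pair output of a VO front end).
    Returns [T_w0, T_w1, ...] with the world frame fixed to camera 0.
    """
    pose = np.eye(4)
    trajectory = [pose]
    for T in relative_poses:
        pose = pose @ T  # incremental composition: errors accumulate here
        trajectory.append(pose)
    return trajectory

# Toy trajectory: move 1 m forward, then turn 90 degrees, four times over,
# which closes a square and returns the camera to its starting pose.
step = translation(1.0, 0.0, 0.0) @ rotation_z(np.pi / 2)
trajectory = integrate_odometry([step] * 4)
print(np.allclose(trajectory[-1], np.eye(4)))  # True: back at the origin
```

Because each relative estimate carries error, the chained trajectory drifts over time; this is precisely why the global optimization mentioned in the abstract (pose-graph optimization or bundle adjustment over keyframes) is applied on top of the incremental front end.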


References

  1. Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)

  2. Bailey, T., Durrant-Whyte, H.: Simultaneous localization and mapping: part II. IEEE Robot. Autom. Mag. 13(3), 108–117 (2006)

  3. Nistér, D., Naroditsky, O., Bergen, J.: Visual odometry. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), vol. 1, pp. I–I. IEEE (2004)

  4. Zhu, C., He, M., et al.: A survey of monocular visual odometry. Comput. Eng. Appl. 54(07), 20–28+55 (2018). (in Chinese with English abstract)

  5. Lin, S., Chen, Y., Lai, Y.K., Martin, R.R., Cheng, Z.Q.: Fast capture of textured full-body avatar with RGB-D cameras. Vis. Comput. 32(6–8), 681–691 (2016)

  6. Sharma, O., Pandey, J., Akhtar, H., Rathee, G.: Navigation in AR based on digital replicas. Vis. Comput. 34(6–8), 925–936 (2018)

  7. Teng, C.H., Chuo, K.Y., Hsieh, C.Y.: Reconstructing three-dimensional models of objects using a Kinect sensor. Vis. Comput. 34, 1507–1523 (2018)

  8. Bloesch, M., Omari, S., Hutter, M., Siegwart, R.: Robust visual inertial odometry using a direct EKF-based approach. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 298–304. IEEE (2015)

  9. Kai, W., Liwei, L., Yong, L., Peng, D., Guoting, X.: Application research of chaotic carrier frequency modulation technology in two-stage matrix converter. Math. Probl. Eng. 2019(2614327), 8 (2019). https://doi.org/10.1155/2019/2614327

  10. Qin, T., Li, P., Shen, S.: VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 34(4), 1004–1020 (2018)

  11. Strasdat, H., Montiel, J.M., Davison, A.J.: Visual SLAM: why filter? Image Vis. Comput. 30(2), 65–77 (2012)

  12. Strasdat, H., Montiel, J.M.M., Davison, A.J.: Real-time monocular SLAM: why filter? In: 2010 IEEE International Conference on Robotics and Automation (ICRA), pp. 2657–2664. IEEE (2010)

  13. Kai, W., JinBo, P., LiWei, L., Shengzhe, Z., Yuhao, L., Tiezhu, Z.: Synthesis of hydrophobic carbon nanotubes/reduced graphene oxide composite films by flash light irradiation. Front. Chem. Sci. Eng. 12(3), 376–382 (2018)

  14. Kai, W., ShengZhe, Z., YanTing, Z., Jun, R., LiWei, L., Yong, L.: Synthesis of porous carbon by activation method and its electrochemical performance. Int. J. Electrochem. Sci. 13(11), 10766–10773 (2018)

  15. Mei, C., Sibley, G., Cummins, M., et al.: RSLAM: a system for large-scale mapping in constant-time using stereo. Int. J. Comput. Vis. 94(2), 198–214 (2011)

  16. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)

  17. Gao, X., Zhang, T., Liu, Y., Yan, Q.: Lectures on Visual SLAM: From Theory to Practice. Publishing House of Electronics Industry, Beijing (2017)

  18. Davison, A.J., Reid, I.D., Molton, N.D., Stasse, O.: MonoSLAM: real-time single camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 29(6), 1052–1067 (2007)

  19. Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: 6th IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2007), pp. 225–234. IEEE (2007)

  20. Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: dense tracking and mapping in real-time. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2320–2327. IEEE (2011)

  21. Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D., Davison, A., Fitzgibbon, A.: KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera. In: Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, pp. 559–568. ACM (2011)

  22. Kerl, C., Sturm, J., Cremers, D.: Dense visual SLAM for RGB-D cameras. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2100–2106. IEEE (2013)

  23. Forster, C., Pizzoli, M., Scaramuzza, D.: SVO: fast semi-direct monocular visual odometry. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 15–22. IEEE (2014)

  24. Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: large-scale direct monocular SLAM. In: European Conference on Computer Vision, pp. 834–849. Springer, Cham (2014)

  25. Bloesch, M., Burri, M., Omari, S., Hutter, M., Siegwart, R.: Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback. Int. J. Robot. Res. 36(10), 1053–1072 (2017)

  26. Whelan, T., Salas-Moreno, R.F., Glocker, B., Davison, A.J., Leutenegger, S.: ElasticFusion: real-time dense SLAM and light source estimation. Int. J. Robot. Res. 35(14), 1697–1716 (2016)

  27. Whelan, T., Salas-Moreno, R.F., Glocker, B., Davison, A.J., Leutenegger, S.: ElasticFusion: dense SLAM without a pose graph. Int. J. Robot. Res. 35(14), 1–9 (2016)

  28. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 40(3), 611–625 (2018)

  29. Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Robot. 31(5), 1147–1163 (2015)

  30. Schneider, T., Dymczyk, M., Fehr, M., Egger, K., Lynen, S., Gilitschenski, I., et al.: Maplab: an open framework for research in visual-inertial mapping and localization. IEEE Robot. Autom. Lett. 3(3), 1418–1425 (2018)

  31. Konolige, K., Agrawal, M., Sola, J.: Large-scale visual odometry for rough terrain. In: Kaneko, M., Nakamura, Y. (eds.) Robotics Research, pp. 201–212. Springer, Berlin (2010)

  32. Quijada, S.D., Zalama, E., García-Bermejo, J.G., Worst, R., Behnke, S.: Fast 6D odometry based on visual features and depth. In: Lee, S., Cho, H., Yoon, K.J., Lee, J. (eds.) Intelligent Autonomous Systems 12, pp. 245–256. Springer, Berlin (2013)

  33. Tang, C., Wang, O., Tan, P.: GlobalSLAM: initialization-robust monocular visual SLAM (2017). arXiv preprint arXiv:1708.04814

  34. Scaramuzza, D., Fraundorfer, F.: Visual odometry [tutorial]. IEEE Robot. Autom. Mag. 18(4), 80–92 (2011)

  35. Hartley, R.I.: In defense of the eight-point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 19(6), 580–593 (1997)

  36. Besl, P.J., McKay, N.D.: Method for registration of 3-D shapes. In: Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611, pp. 586–607. International Society for Optics and Photonics (1992)

  37. Persson, M., Nordberg, K.: Lambda twist: an accurate fast robust perspective three point (P3P) solver. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 318–332 (2018)

  38. Altantsetseg, E., Khorloo, O., Konno, K.: Rigid registration of noisy point clouds based on higher-dimensional error metrics. Vis. Comput. 34(6–8), 1021–1030 (2018)

  39. Kang, H.Y., Han, J.: Feature-preserving procedural texture. Vis. Comput. 33(6–8), 761–768 (2017)

  40. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  41. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: European Conference on Computer Vision, pp. 404–417. Springer, Berlin (2006)

  42. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2564–2571. IEEE (2011)

  43. Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2548–2555. IEEE (2011)

  44. Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In: European Conference on Computer Vision, pp. 430–443. Springer, Berlin (2006)

  45. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: European Conference on Computer Vision (2010)

  46. Muller, P., Savakis, A.: Flowdometry: an optical flow and deep learning based approach to visual odometry. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 624–631. IEEE (2017)

  47. Baker, S., Matthews, I.: Lucas-Kanade 20 years on: a unifying framework. Int. J. Comput. Vis. 56(3), 221–255 (2004)

  48. Lu, F., Zhou, B., Zhang, Y., Zhao, Q.: Real-time 3D scene reconstruction with dynamically moving object using a single depth camera. Vis. Comput. 34, 753–763 (2018)

  49. Jin, H.L., Favaro, P., Soatto, S.: A semi-direct approach to structure from motion. Vis. Comput. 19(6), 377–394 (2003)

  50. Zhou, Y., Yan, F., Zhou, Z.: Handling pure camera rotation in semi-dense monocular SLAM. Vis. Comput. 35, 123 (2019). https://doi.org/10.1007/s00371-017-1435-0

  51. Silveira, G., Malis, E., Rives, P.: An efficient direct approach to visual SLAM. IEEE Trans. Robot. 24(5), 969–979 (2008)

  52. Pizzoli, M., Forster, C., Scaramuzza, D.: REMODE: probabilistic, monocular dense reconstruction in real time. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 2609–2616. IEEE (2014)

  53. Engel, J., Sturm, J., Cremers, D.: Semi-dense visual odometry for a monocular camera. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1449–1456 (2013)

  54. Vogiatzis, G., Hernandez, C.: Video-based, real-time multi-view stereo. Image Vis. Comput. 29(7), 434–441 (2011)

  55. Forster, C., Zhang, Z., Gassner, M., Werlberger, M., Scaramuzza, D.: SVO: semidirect visual odometry for monocular and multicamera systems. IEEE Trans. Robot. 33(2), 249–265 (2017)

  56. Lu, R., Zhu, F., Wu, Q., Fu, X.: Search inliers based on redundant geometric constraints. Vis. Comput. (2018). https://doi.org/10.1007/s00371-018-1605-8

  57. Zhu, A.Z., Atanasov, N., Daniilidis, K.: Event-based visual inertial odometry. In: CVPR, pp. 5816–5824 (2017)

  58. Lin, Y., Gao, F., Qin, T., Gao, W., Liu, T., Wu, W., Zhenfei, Y., Shen, S.: Autonomous aerial navigation using monocular visual-inertial fusion. J. Field Robot. 35(1), 23–51 (2018)

  59. Gui, J., Gu, D., Wang, S., Hu, H.: A review of visual inertial odometry from filtering and optimisation perspectives. Adv. Robot. 29(20), 1289–1301 (2015)

  60. Weiss, S., Achtelik, M.W., Lynen, S., Chli, M., Siegwart, R.: Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments. In: 2012 IEEE International Conference on Robotics and Automation (ICRA), pp. 957–964. IEEE (2012)

  61. Weiss, S., Achtelik, M.W., Lynen, S., Chli, M., Siegwart, R.: Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments. In: IEEE International Conference on Robotics & Automation. IEEE (2013)

  62. Ranganathan, A., Kaess, M., Dellaert, F.: Fast 3D pose estimation with out-of-sequence measurements. In: IEEE/RSJ International Conference on Intelligent Robots & Systems. IEEE (2007)

  63. Yang, S., Scherer, S.A., Yi, X., Zell, A.: Multi-camera visual SLAM for autonomous navigation of micro aerial vehicles. Robot. Auton. Syst. 93, 116–134 (2017)

  64. Usenko, V., Engel, J., Stückler, J., Cremers, D.: Direct visual-inertial odometry with stereo cameras. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 1885–1892. IEEE (2016)

  65. Shetty, A.P.: GPS-LiDAR sensor fusion aided by 3D city models for UAVs (Doctoral dissertation) (2017)

  66. Zeng, A., Song, S., Nießner, M., Fisher, M., Xiao, J., Funkhouser, T.: 3DMatch: learning local geometric descriptors from RGB-D reconstructions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 199–208. IEEE (2017)

  67. Shaked, A., Wolf, L.: Improved stereo matching with constant highway networks and reflective confidence learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4641–4650 (2017)

  68. Tateno, K., Tombari, F., Laina, I., Navab, N.: CNN-SLAM: real-time dense monocular SLAM with learned depth prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2 (2017)

  69. Gomez-Ojeda, R., Zhang, Z., Gonzalez-Jimenez, J., Scaramuzza, D.: Learning-based image enhancement for visual odometry in challenging HDR environments. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 805–811. IEEE (2018)

  70. Li, R., Wang, S., Long, Z., Gu, D.: UnDeepVO: monocular visual odometry through unsupervised deep learning. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 7286–7291. IEEE (2018)

  71. Gao, X., Zhang, T.: Robust RGB-D simultaneous localization and mapping using planar point features. Robot. Auton. Syst. 72, 1–14 (2015)

  72. Yang, S., Scherer, S.: Direct monocular odometry using points and lines (2017). arXiv preprint arXiv:1703.06380

  73. Li, S.J., Ren, B., Liu, Y., Cheng, M.M., Frost, D., Prisacariu, V.A.: Direct line guidance odometry. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–7. IEEE (2018)

  74. Wang, T., Ling, H.: Gracker: a graph-based planar object tracker. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1494–1501 (2018)


Acknowledgements

This work was supported by the National Key R&D Program of China (Nos. 2018YFC0806900, 2016YFC0800606, 2016YFC0800310 and 2018YFC0407905), the Natural Science Foundation of Jiangsu Province under Grant No. BK20161469, and the Primary Research & Development Plan of Jiangsu Province under Grant Nos. BE2016904, BE2017616, and BE2018754.

Author information


Corresponding author

Correspondence to Chaozheng Zhu.



About this article


Cite this article

He, M., Zhu, C., Huang, Q. et al. A review of monocular visual odometry. Vis Comput 36, 1053–1065 (2020). https://doi.org/10.1007/s00371-019-01714-6


Keywords

Navigation