Visual localization under appearance change: filtering approaches

  • S.I.: DICTA 2019
Neural Computing and Applications

Abstract

A major focus of current research on place recognition is visual localization for autonomous driving. In this scenario, cameras operate continuously, so it is realistic to expect video as the input to visual localization algorithms, as opposed to the single-image querying approach used in other visual localization works. In this paper, we show that exploiting temporal continuity in the testing sequence significantly improves visual localization, both qualitatively and quantitatively. Although intuitive, this idea has not been fully explored in recent works. To this end, we propose two filtering approaches that exploit the temporal smoothness of image sequences: (i) filtering in the discrete domain with a hidden Markov model, and (ii) filtering in the continuous domain with Monte Carlo-based visual localization. Our approaches rely on local features with an encoding technique to represent an image as a single vector. Experimental results on synthetic and real datasets show that our proposed methods achieve better results than the state of the art (i.e., deep learning-based pose regression approaches) for the task of visual localization under significant appearance change. Our synthetic dataset and source code are publicly available (https://sites.google.com/view/g2d-software/home; https://github.com/dadung/Visual-Localization-Filtering).
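
To illustrate the idea of filtering in the discrete domain, the following is a minimal sketch of one hidden Markov model forward-filtering step over a set of database places. It is not the authors' implementation: the Gaussian observation likelihood, the uniform "stay or move forward" motion model, the toy 2-D descriptors, and all function and parameter names (`observation_likelihood`, `predict`, `hmm_filter_step`, `sigma`, `max_step`) are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def observation_likelihood(query: Sequence[float],
                           database: Sequence[Sequence[float]],
                           sigma: float = 0.3) -> List[float]:
    """Assumed Gaussian likelihood of the query vector under each place."""
    likes = []
    for ref in database:
        sq_dist = sum((q - r) ** 2 for q, r in zip(query, ref))
        likes.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return likes


def predict(belief: Sequence[float], max_step: int = 2) -> List[float]:
    """Assumed motion model: stay, or advance up to max_step places."""
    n = len(belief)
    predicted = [0.0] * n
    for j, mass in enumerate(belief):
        # Place j itself is always a valid target, so targets is never empty.
        targets = [i for i in range(j, j + max_step + 1) if i < n]
        for i in targets:
            predicted[i] += mass / len(targets)
    return predicted


def hmm_filter_step(belief: Sequence[float],
                    query: Sequence[float],
                    database: Sequence[Sequence[float]],
                    sigma: float = 0.3,
                    max_step: int = 2) -> List[float]:
    """One forward step: predict with the motion model, weight, normalize."""
    predicted = predict(belief, max_step)
    likes = observation_likelihood(query, database, sigma)
    unnorm = [p * l for p, l in zip(predicted, likes)]
    total = sum(unnorm)
    if total == 0.0:
        # Degenerate case: fall back to a uniform belief.
        return [1.0 / len(belief)] * len(belief)
    return [u / total for u in unnorm]


# Toy route with 5 places; places 1 and 4 look almost identical.
database = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [1.0, 0.05]]
# The vehicle is at place 0, then at place 1 (the second query is noisy).
queries = [[0.0, 0.0], [1.0, 0.04]]

belief = [1.0 / len(database)] * len(database)
for query in queries:
    belief = hmm_filter_step(belief, query, database)

single = observation_likelihood(queries[-1], database)
print(max(range(len(single)), key=lambda i: single[i]))  # prints 4
print(max(range(len(belief)), key=lambda i: belief[i]))  # prints 1
```

In this toy example, matching the last image on its own picks the look-alike place 4, while the filtered belief favors place 1 because the motion model makes a jump from place 0 to place 4 impossible within one step.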

Notes

  1. In more “localized” operations such as parking, where highly accurate 6 DoF estimation is required, it is probably better to rely on the INS.

  2. More fundamentally, the car is a nonholonomic system [1].

  3. On uneven or hilly roads, accelerometers can be used to estimate the vertical motion; hence, VL can focus on map-scale navigation.

  4. The method of [6] will give ambiguous results on noninformative trajectories, e.g., largely straight routes. Hence, VL is still crucial.

  5. Based on Intel i7-6700 @ 3.40GHz, RAM 16GB, NVIDIA GeForce GTX 1080 Ti and the highest graphical configuration for GTA V.

References

  1. Wikipedia (2020) Nonholonomic system. In: https://en.wikipedia.org/wiki/Nonholonomic_system

  2. Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J (2016) NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 5297–5307

  3. Arandjelovic R, Zisserman A (2012) Three things everyone should know to improve object retrieval. In: CVPR

  4. Brachmann E, Krull A, Nowozin S, Shotton J, Michel F, Gumhold S, Rother C (2017) DSAC-differentiable RANSAC for camera localization. In: CVPR

  5. Brahmbhatt S, Gu J, Kim K, Hays J, Kautz J (2018) Geometry-aware learning of maps for camera localization. In: CVPR

  6. Brubaker MA, Geiger A, Urtasun R (2013) Lost! leveraging the crowd for probabilistic visual self-localization. In: CVPR

  7. Bustos AP, Chin TJ, Eriksson A, Reid I (2019) Visual SLAM: why bundle adjust? In: ICRA

  8. Churchill W, Newman P (2013) Experience-based navigation for long-term localisation. Int J Robotics Res 32:1645

  9. Do TT, Tran QD, Cheung NM (2015) FAemb: a function approximation-based embedding method for image retrieval. In: CVPR

  10. Doan AD, Jawaid AM, Do TT, Chin TJ (2018) G2D: from GTA to Data. arXiv preprint arXiv:1806.07381 pp 1–9

  11. Doan AD, Latif Y, Chin TJ, Liu Y, Do TT, Reid I (2019) Scalable place recognition under appearance change for autonomous driving. In: ICCV

  12. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: CVPR

  13. Jégou H, Chum O (2012) Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening. In: ECCV

  14. Jégou H, Douze M, Schmid C, Pérez P (2010) Aggregating local descriptors into a compact image representation. In: CVPR

  15. Jégou H, Zisserman A (2014) Triangulation embedding and democratic aggregation for image search. In: CVPR

  16. Junkins JL, Schaub H (2009) Analytical mechanics of space systems. American Institute of Aeronautics and Astronautics, Reston

  17. Kendall A, Cipolla R (2016) Modelling uncertainty in deep learning for camera relocalization. In: ICRA

  18. Kendall A, Cipolla R, et al. (2017) Geometric loss functions for camera pose regression with deep learning. In: CVPR

  19. Kendall A, Grimes M, Cipolla R (2015) PoseNet: a convolutional network for real-time 6-DOF camera relocalization. In: ICCV

  20. Ko J, Fox D (2009) GP-Bayesfilters: Bayesian filtering using Gaussian process prediction and observation models. Auton Robots 27:75

  21. Krähenbühl P (2018) Free supervision from video games. In: CVPR

  22. Lepetit V, Moreno-Noguer F, Fua P (2009) EPnP: an accurate O(n) solution to the PnP problem. IJCV 81:155

  23. Maddern W, Pascoe G, Linegar C, Newman P (2017) 1 year, 1000 km: the Oxford RobotCar dataset. Int J Robotics Res 36:3

  24. Markley FL, Cheng Y, Crassidis JL, Oshman Y (2007) Averaging quaternions. J Guid Control Dyn 30:1193

  25. Menegatti E, Zoccarato M, Pagello E, Ishiguro H (2004) Image-based Monte Carlo localisation with omnidirectional images. Robotics Auton Syst 48:17

  26. Milford MJ, Wyeth GF (2012) SeqSLAM: visual route-based navigation for sunny summer days and stormy winter nights. In: ICRA

  27. Murray N, Perronnin F (2014) Generalized max pooling. In: CVPR

  28. Richter SR, Hayder Z, Koltun V (2017) Playing for benchmarks. In: ICCV

  29. Rubino C, Del Bue A, Chin TJ (2018) Practical motion segmentation for urban street view scenes. In: ICRA

  30. Sattler T, Leibe B, Kobbelt L (2017) Efficient & effective prioritized matching for large-scale image-based localization. TPAMI 39:1744–1756

  31. Sattler T, Maddern W, Toft C, Torii A, Hammarstrand L, Stenborg E, Safari D, Okutomi M, Pollefeys M, Sivic J, et al. (2018) Benchmarking 6DOF outdoor visual localization in changing conditions. In: CVPR

  32. Schönberger JL, Frahm JM (2016) Structure-from-motion revisited. In: CVPR

  33. Schönberger JL, Pollefeys M, Geiger A, Sattler T (2018) Semantic visual localization. In: CVPR

  34. Sünderhauf N, Neubert P, Protzel P (2013) Are we there yet? challenging SeqSLAM on a 3000 km journey across all four seasons. In: ICRA workshop on long-term autonomy

  35. Torii A, Arandjelovic R, Sivic J, Okutomi M, Pajdla T (2015) 24/7 place recognition by view synthesis. In: CVPR

  36. Tran NT, Le Tan DK, Doan AD, Do TT, Bui TA, Tan M, Cheung NM (2019) On-device scalable image-based localization via prioritized cascade search and fast one-many RANSAC. TIP 28:1675

  37. Tremblay J, Prakash A, Acuna D, Brophy M, Jampani V, Anil C, To T, Cameracci E, Boochoon S, Birchfield S (2018) Training deep networks with synthetic data: bridging the reality gap by domain randomization. In: CVPR workshop on autonomous driving

  38. Walch F, Hazirbas C, Leal-Taixe L, Sattler T, Hilsenbeck S, Cremers D (2017) Image-based localization using LSTMs for structured feature correlation. In: ICCV

  39. Wang P, Huang X, Cheng X, Zhou D, Geng Q, Yang R (2019) The ApolloScape open dataset for autonomous driving and its application. TPAMI 42:2702–2719

  40. Whitley D (1994) A genetic algorithm tutorial. Stat Comput 4:65

  41. Wolf J, Burgard W, Burkhardt H (2002) Robust vision-based localization for mobile robots using an image retrieval system based on invariant features. In: ICRA

  42. Wolf J, Burgard W, Burkhardt H (2005) Robust vision-based localization by combining an image-retrieval system with Monte Carlo localization. IEEE Trans Robotics 21:208

  43. Yu K, Zhang T (2010) Improved local coordinate coding using local tangents. In: ICML

Author information

Corresponding author

Correspondence to Anh-Dzung Doan.

Ethics declarations

Conflict of interest

We declare that there is no potential conflict of interest for this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Doan, AD., Latif, Y., Chin, TJ. et al. Visual localization under appearance change: filtering approaches. Neural Comput & Applic 33, 7325–7338 (2021). https://doi.org/10.1007/s00521-020-05339-y

  • DOI: https://doi.org/10.1007/s00521-020-05339-y
