Visual localization under appearance change: filtering approaches

Doan, Anh-Dzung; Latif, Yasir; Chin, Tat-Jun; Liu, Yu; Ch’ng, Shin-Fang; Do, Thanh-Toan; Reid, Ian

doi:10.1007/s00521-020-05339-y

S.I. : DICTA 2019
Published: 17 September 2020

Volume 33, pages 7325–7338, (2021)

Neural Computing and Applications

Anh-Dzung Doan ORCID: orcid.org/0000-0001-5517-070X¹,
Yasir Latif¹,
Tat-Jun Chin¹,
Yu Liu¹,
Shin-Fang Ch’ng¹,
Thanh-Toan Do² &
Ian Reid¹

Abstract

A major focus of current research on place recognition is visual localization for autonomous driving. In this scenario, as cameras will be operating continuously, it is realistic to expect videos as an input to visual localization algorithms, as opposed to the single-image querying approach used in other visual localization works. In this paper, we show that exploiting temporal continuity in the testing sequence significantly improves visual localization, both qualitatively and quantitatively. Although intuitive, this idea has not been fully explored in recent works. To this end, we propose two filtering approaches to exploit the temporal smoothness of image sequences: (i) filtering in the discrete domain with a hidden Markov model, and (ii) filtering in the continuous domain with Monte Carlo-based visual localization. Our approaches rely on local features with an encoding technique to represent an image as a single vector. The experimental results on synthetic and real datasets show that our proposed methods achieve better results than the state of the art (i.e., deep learning-based pose regression approaches) for the task of visual localization under significant appearance change. Our synthetic dataset and source code are made publicly available (https://sites.google.com/view/g2d-software/home; https://github.com/dadung/Visual-Localization-Filtering).
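To make the idea of discrete-domain filtering concrete, the following is a minimal sketch of a generic hidden Markov model forward filter over a sequence of places. It is not the authors' implementation (that is available at the repository linked above). The function names `hmm_filter` and `forward_transition`, the "move forward by at most `max_step` places per frame" motion model, and the toy observation scores are all assumptions made for illustration; in practice the per-frame scores would come from comparing the single-vector image representations of the query and the database.

```python
import numpy as np


def forward_transition(n_places, max_step=2):
    """Hypothetical motion model: from place i the vehicle stays put or moves
    forward by up to max_step places, each option equally likely."""
    A = np.zeros((n_places, n_places))
    for i in range(n_places):
        for step in range(max_step + 1):
            # Clamp at the last place so every row keeps the same total mass.
            A[i, min(i + step, n_places - 1)] += 1.0
    return A / A.sum(axis=1, keepdims=True)


def hmm_filter(obs_likelihoods, transition, prior=None):
    """Forward (filtering) pass of a discrete HMM.

    obs_likelihoods: (T, N) array; entry [t, i] is proportional to the
        likelihood of query frame t given database place i.
    transition: (N, N) row-stochastic matrix; entry [i, j] is the
        probability of moving from place i to place j between frames.
    Returns a (T, N) array of filtered beliefs, one row per frame.
    """
    obs = np.asarray(obs_likelihoods, dtype=float)
    n_frames, n_places = obs.shape
    if prior is None:
        belief = np.full(n_places, 1.0 / n_places)  # uniform prior
    else:
        belief = np.asarray(prior, dtype=float)
    beliefs = np.empty((n_frames, n_places))
    for t in range(n_frames):
        if t > 0:
            belief = belief @ transition  # predict with the motion model
        belief = belief * obs[t]          # weight by the current observation
        belief = belief / belief.sum()    # renormalize to a distribution
        beliefs[t] = belief
    return beliefs


# Toy query sequence over 5 places. Frame 1 is ambiguous: its best
# single-image match is place 4, which is not reachable from place 0
# under the assumed motion model.
obs = np.array([
    [0.90, 0.025, 0.025, 0.025, 0.025],
    [0.05, 0.35, 0.10, 0.05, 0.45],
    [0.05, 0.05, 0.80, 0.05, 0.05],
])
beliefs = hmm_filter(obs, forward_transition(5, max_step=2))
raw_estimate = obs.argmax(axis=1)           # single-image querying
filtered_estimate = beliefs.argmax(axis=1)  # with temporal filtering
```

In this toy sequence, single-image querying jumps to place 4 on the ambiguous frame, while the filtered estimate stays on the temporally consistent place 1. This is the kind of benefit from temporal continuity that the abstract describes, shown here only under the simplified assumptions above.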

Notes

In more “localized” operations such as parking, where highly accurate 6 DoF estimation is required, it is probably better to rely on the INS.
More fundamentally, the car is a nonholonomic system [1].
On uneven or hilly roads, accelerometers can be used to estimate the vertical motion; hence, VL can focus on map-scale navigation.
The method of [6] will give ambiguous results on noninformative trajectories, e.g., largely straight routes. Hence, VL is still crucial.
Based on an Intel i7-6700 @ 3.40 GHz, 16 GB RAM, an NVIDIA GeForce GTX 1080 Ti, and the highest graphical configuration for GTA V.

References

[1] Wikipedia (2020) Nonholonomic system. In: https://en.wikipedia.org/wiki/Nonholonomic_system
[2] Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J (2016) NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 5297–5307
[3] Arandjelovic R, Zisserman A (2012) Three things everyone should know to improve object retrieval. In: CVPR
[4] Brachmann E, Krull A, Nowozin S, Shotton J, Michel F, Gumhold S, Rother C (2017) DSAC - differentiable RANSAC for camera localization. In: CVPR
[5] Brahmbhatt S, Gu J, Kim K, Hays J, Kautz J (2018) Geometry-aware learning of maps for camera localization. In: CVPR
[6] Brubaker MA, Geiger A, Urtasun R (2013) Lost! Leveraging the crowd for probabilistic visual self-localization. In: CVPR
[7] Bustos AP, Chin TJ, Eriksson A, Reid I (2019) Visual SLAM: why bundle adjust? In: ICRA
[8] Churchill W, Newman P (2013) Experience-based navigation for long-term localisation. Int J Robotics Res 32:1645
[9] Do TT, Tran QD, Cheung NM (2015) FAemb: a function approximation-based embedding method for image retrieval. In: CVPR
[10] Doan AD, Jawaid AM, Do TT, Chin TJ (2018) G2D: from GTA to Data. arXiv preprint arXiv:1806.07381 pp 1–9
[11] Doan AD, Latif Y, Chin TJ, Liu Y, Do TT, Reid I (2019) Scalable place recognition under appearance change for autonomous driving. In: ICCV
[12] He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: CVPR
[13] Jégou H, Chum O (2012) Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening. In: ECCV
[14] Jégou H, Douze M, Schmid C, Pérez P (2010) Aggregating local descriptors into a compact image representation. In: CVPR
[15] Jégou H, Zisserman A (2014) Triangulation embedding and democratic aggregation for image search. In: CVPR
[16] Junkins JL, Schaub H (2009) Analytical mechanics of space systems. American Institute of Aeronautics and Astronautics, Reston
[17] Kendall A, Cipolla R (2016) Modelling uncertainty in deep learning for camera relocalization. In: ICRA
[18] Kendall A, Cipolla R, et al. (2017) Geometric loss functions for camera pose regression with deep learning. In: CVPR
[19] Kendall A, Grimes M, Cipolla R (2015) PoseNet: a convolutional network for real-time 6-DOF camera relocalization. In: CVPR
[20] Ko J, Fox D (2009) GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Auton Robots 27:75
[21] Krähenbühl P (2018) Free supervision from video games. In: CVPR
[22] Lepetit V, Moreno-Noguer F, Fua P (2009) EPnP: an accurate O(n) solution to the PnP problem. IJCV 81:155
[23] Maddern W, Pascoe G, Linegar C, Newman P (2017) 1 year, 1000 km: the Oxford RobotCar dataset. Int J Robotics Res 36:3
[24] Markley FL, Cheng Y, Crassidis JL, Oshman Y (2007) Averaging quaternions. J Guid Control Dyn 30:1193
[25] Menegatti E, Zoccarato M, Pagello E, Ishiguro H (2004) Image-based Monte Carlo localisation with omnidirectional images. Robotics Auton Syst 48:17
[26] Milford MJ, Wyeth GF (2012) SeqSLAM: visual route-based navigation for sunny summer days and stormy winter nights. In: ICRA
[27] Murray N, Perronnin F (2014) Generalized max pooling. In: CVPR
[28] Richter SR, Hayder Z, Koltun V (2017) Playing for benchmarks. In: ICCV
[29] Rubino C, Del Bue A, Chin TJ (2018) Practical motion segmentation for urban street view scenes. In: ICRA
[30] Sattler T, Leibe B, Kobbelt L (2017) Efficient & effective prioritized matching for large-scale image-based localization. TPAMI 39:1744–1756
[31] Sattler T, Maddern W, Toft C, Torii A, Hammarstrand L, Stenborg E, Safari D, Okutomi M, Pollefeys M, Sivic J, et al. (2018) Benchmarking 6DOF outdoor visual localization in changing conditions. In: CVPR
[32] Schönberger JL, Frahm JM (2016) Structure-from-motion revisited. In: CVPR
[33] Schönberger JL, Pollefeys M, Geiger A, Sattler T (2018) Semantic visual localization. In: CVPR
[34] Sünderhauf N, Neubert P, Protzel P (2013) Are we there yet? Challenging SeqSLAM on a 3000 km journey across all four seasons. In: ICRA workshop on long-term autonomy
[35] Torii A, Arandjelovic R, Sivic J, Okutomi M, Pajdla T (2015) 24/7 place recognition by view synthesis. In: CVPR
[36] Tran NT, Le Tan DK, Doan AD, Do TT, Bui TA, Tan M, Cheung NM (2019) On-device scalable image-based localization via prioritized cascade search and fast one-many RANSAC. TIP 28:1675
[37] Tremblay J, Prakash A, Acuna D, Brophy M, Jampani V, Anil C, To T, Cameracci E, Boochoon S, Birchfield S (2018) Training deep networks with synthetic data: bridging the reality gap by domain randomization. In: CVPR workshop on autonomous driving
[38] Walch F, Hazirbas C, Leal-Taixe L, Sattler T, Hilsenbeck S, Cremers D (2017) Image-based localization using LSTMs for structured feature correlation. In: ICCV
[39] Wang P, Huang X, Cheng X, Zhou D, Geng Q, Yang R (2019) The ApolloScape open dataset for autonomous driving and its application. TPAMI 42:2702–2719
[40] Whitley D (1994) A genetic algorithm tutorial. Stat Comput 4:65
[41] Wolf J, Burgard W, Burkhardt H (2002) Robust vision-based localization for mobile robots using an image retrieval system based on invariant features. In: ICRA
[42] Wolf J, Burgard W, Burkhardt H (2005) Robust vision-based localization by combining an image-retrieval system with Monte Carlo localization. IEEE Trans Robotics 21:208
[43] Yu K, Zhang T (2010) Improved local coordinate coding using local tangents. In: ICML

Author information

Authors and Affiliations

School of Computer Science, The University of Adelaide, Adelaide, Australia
Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Shin-Fang Ch’ng & Ian Reid
Department of Computer Science, University of Liverpool, Liverpool, UK
Thanh-Toan Do

Corresponding author

Correspondence to Anh-Dzung Doan.

Ethics declarations

Conflict of interest

We declare that there is no potential conflict of interest for this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Doan, AD., Latif, Y., Chin, TJ. et al. Visual localization under appearance change: filtering approaches. Neural Comput & Applic 33, 7325–7338 (2021). https://doi.org/10.1007/s00521-020-05339-y

Received: 17 February 2020
Accepted: 02 September 2020
Published: 17 September 2020
Issue Date: July 2021
DOI: https://doi.org/10.1007/s00521-020-05339-y
