Abstract
Scene reconstruction and visual localization in dynamic environments such as street scenes are challenging due to the lack of distinctive, stable keypoints. While learned convolutional features have proven robust to changes in viewing conditions, handcrafted features still offer advantages in distinctiveness and accuracy when applied to structure from motion. For collaborative reconstruction of road sections by a car fleet, we propose multimodal domain adaptation as a preprocessing step that aligns images in appearance and enhances keypoint matching across viewing conditions while preserving the advantages of handcrafted features. Training a generative adversarial network to translate between different illumination and weather conditions, we evaluate qualitative and quantitative aspects of domain adaptation and its impact on feature correspondences. Combined with a multi-feature discriminator, the model is optimized to synthesize images that not only improve feature matching but also exhibit high visual quality. Experiments on a challenging multi-domain dataset recorded in various road scenes over multiple test drives show that our approach outperforms other traditional and learning-based methods, improving the completeness or accuracy of structure from motion with multimodal two-domain image collections in eight out of ten test scenes.
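The keypoint-matching step whose improvement the abstract measures is conventionally Lowe's ratio test over handcrafted descriptors such as SIFT: a correspondence is kept only if its nearest-neighbour distance is clearly smaller than the second-nearest. A minimal pure-Python sketch, using toy 2-D descriptors in place of real 128-D SIFT vectors (the function name and data are illustrative, not from the paper):

```python
import math

def ratio_test_match(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b,
    keeping only matches that pass Lowe's ratio test (the nearest distance
    must be clearly smaller than the second-nearest)."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    matches = []
    for i, d in enumerate(desc_a):
        # Distances from this descriptor to every candidate in the other image.
        dists = sorted((dist(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

# Toy example: descriptor 0 matches unambiguously; descriptor 1 has two
# near-identical candidates and is rejected by the ratio test.
a = [[0.0, 0.0], [5.0, 5.0]]
b = [[9.0, 9.0], [0.1, 0.0], [5.0, 5.2], [5.2, 5.0]]
print(ratio_test_match(a, b))  # → [(0, 1)]
```

The intuition behind the paper's preprocessing step is that translating images into a common illumination/weather domain makes genuine correspondences pass this test more often, because descriptor distances across domains shrink relative to distractors.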
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Electronic supplementary material
Supplementary material 1 (mp4, 30871 KB)
Cite this article
Venator, M., Aklanoglu, S., Bruns, E. et al. Enhancing collaborative road scene reconstruction with unsupervised domain alignment. Machine Vision and Applications 32, 13 (2021). https://doi.org/10.1007/s00138-020-01144-8