
Learning occlusion-aware view synthesis for light fields

  • Short paper
  • Published in Pattern Analysis and Applications

Abstract

We present a novel learning-based approach to synthesize new views of a light field image. In particular, given the four corner views of a light field, the proposed method estimates any in-between view. We use three sequential convolutional neural networks for feature extraction, scene geometry estimation and view selection. In contrast to state-of-the-art approaches, we propose to estimate a separate disparity map per input view in order to handle occlusions. Together with the view selection network, this strategy proves to be the key to obtaining accurate reconstructions near object boundaries. Ablation studies and comparisons against the state of the art on Lytro light fields show the superior performance of the proposed method. Furthermore, the method is adapted and tested on wide-baseline light fields acquired with a camera array and, despite having to deal with large occluded areas, it yields very promising results.
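To make the three-stage pipeline concrete, the sketch below illustrates one possible reading of it in PyTorch. This is a minimal illustration under our own assumptions, not the authors' implementation: the layer widths, the `conv_block` helper, the `OcclusionAwareSynthesis` class, the corner ordering and the disparity sign convention are all hypothetical. It captures the structure named in the abstract: features are extracted from the four corner views, one disparity map is regressed per view (the occlusion-handling choice), each corner is backward-warped toward the target angular position, and a selection network softly merges the warped views.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(cin, cout):
    """3x3 convolution followed by an ELU activation."""
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ELU())


class OcclusionAwareSynthesis(nn.Module):
    """Toy three-stage pipeline: features -> per-view disparity -> selection."""

    def __init__(self, feat=32):
        super().__init__()
        # Stage 1: shared feature extractor, applied to each corner view.
        self.features = nn.Sequential(conv_block(3, feat), conv_block(feat, feat))
        # Stage 2: regresses one disparity map per corner view (4 maps in
        # total), the occlusion-handling choice highlighted in the abstract.
        self.disparity = nn.Sequential(
            conv_block(4 * feat + 2, feat), conv_block(feat, feat),
            nn.Conv2d(feat, 4, 3, padding=1))
        # Stage 3: view selection, i.e. per-pixel soft weights deciding which
        # warped corner view to trust, especially near object boundaries.
        self.selection = nn.Sequential(
            conv_block(4 * 3 + 4 + 2, feat), conv_block(feat, feat),
            nn.Conv2d(feat, 4, 3, padding=1))

    @staticmethod
    def warp(view, disp, du, dv):
        """Backward-warp a corner view toward the target angular position.

        The pixel shift is the disparity scaled by the angular offset
        (du, dv) between the corner and the target view.
        """
        b, _, h, w = view.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        xs = xs.to(view) + du * disp.squeeze(1)  # (B, H, W)
        ys = ys.to(view) + dv * disp.squeeze(1)
        grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
        return F.grid_sample(view, grid, align_corners=True)

    def forward(self, corners, uv):
        """corners: 4 tensors (B,3,H,W) at angular positions (0,0), (1,0),
        (0,1), (1,1); uv: (B,2) target position inside the unit square."""
        b, _, h, w = corners[0].shape
        pos = uv.view(b, 2, 1, 1).expand(-1, -1, h, w)
        feats = torch.cat([self.features(c) for c in corners], dim=1)
        disps = self.disparity(torch.cat([feats, pos], dim=1))  # (B,4,H,W)
        anchors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
        warped = []
        for i, (view, (au, av)) in enumerate(zip(corners, anchors)):
            du = (uv[:, 0] - au).view(b, 1, 1)
            dv = (uv[:, 1] - av).view(b, 1, 1)
            warped.append(self.warp(view, disps[:, i:i + 1], du, dv))
        weights = torch.softmax(
            self.selection(torch.cat(warped + [disps, pos], dim=1)), dim=1)
        # Occlusion-aware merge: each pixel is a convex combination of the
        # four warped views, so occluded contributions can be down-weighted.
        return sum(weights[:, i:i + 1] * warped[i] for i in range(4))
```

Under these assumptions, calling `OcclusionAwareSynthesis()(corners, torch.tensor([[0.5, 0.5]]))` with four `(1, 3, H, W)` corner views would produce the central view; the soft selection weights are what lets the merge favor unoccluded corners near depth discontinuities.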




Acknowledgements

J. Navarro acknowledges support from the Ministerio de Economía y Competitividad of the Spanish Government under Grant TIN2017-85572-P (MINECO/AEI/FEDER, UE).

Author information

Correspondence to J. Navarro.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Navarro, J., Sabater, N. Learning occlusion-aware view synthesis for light fields. Pattern Anal Applic 24, 1319–1334 (2021). https://doi.org/10.1007/s10044-021-00956-2
