
Learning occlusion-aware view synthesis for light fields

  • Short paper
  • Published in Pattern Analysis and Applications

Abstract

We present a novel learning-based approach to synthesize new views of a light field image. In particular, given the four corner views of a light field, the proposed method estimates any in-between view. We use three sequential convolutional neural networks for feature extraction, scene geometry estimation and view selection. In contrast to state-of-the-art approaches, we propose to estimate a separate disparity map per input view in order to handle occlusions. Together with the view selection network, this strategy proves to be the key to obtaining accurate reconstructions near object boundaries. Ablation studies and comparisons against the state of the art on Lytro light fields show the superior performance of the proposed method. Furthermore, the method is adapted and tested on wide-baseline light fields acquired with a camera array and, despite having to deal with large occluded areas, it yields very promising results.
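To make the three-stage pipeline concrete, the sketch below illustrates one possible reading of it in PyTorch. This is a minimal illustration under our own assumptions, not the authors' implementation: the layer widths, the `conv_block` helper, the `OcclusionAwareSynthesis` class, the corner ordering and the disparity sign convention are all hypothetical. It captures the structure named in the abstract: features are extracted from the four corner views, one disparity map is regressed per view (the occlusion-handling choice), each corner is backward-warped toward the target angular position, and a selection network softly merges the warped views.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(cin, cout):
    """3x3 convolution followed by an ELU activation."""
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ELU())


class OcclusionAwareSynthesis(nn.Module):
    """Toy three-stage pipeline: features -> per-view disparity -> selection."""

    def __init__(self, feat=32):
        super().__init__()
        # Stage 1: shared feature extractor, applied to each corner view.
        self.features = nn.Sequential(conv_block(3, feat), conv_block(feat, feat))
        # Stage 2: regresses one disparity map per corner view (4 maps in
        # total), the occlusion-handling choice highlighted in the abstract.
        self.disparity = nn.Sequential(
            conv_block(4 * feat + 2, feat), conv_block(feat, feat),
            nn.Conv2d(feat, 4, 3, padding=1))
        # Stage 3: view selection, i.e. per-pixel soft weights deciding which
        # warped corner view to trust, especially near object boundaries.
        self.selection = nn.Sequential(
            conv_block(4 * 3 + 4 + 2, feat), conv_block(feat, feat),
            nn.Conv2d(feat, 4, 3, padding=1))

    @staticmethod
    def warp(view, disp, du, dv):
        """Backward-warp a corner view toward the target angular position.

        The pixel shift is the disparity scaled by the angular offset
        (du, dv) between the corner and the target view.
        """
        b, _, h, w = view.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        xs = xs.to(view) + du * disp.squeeze(1)  # (B, H, W)
        ys = ys.to(view) + dv * disp.squeeze(1)
        grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
        return F.grid_sample(view, grid, align_corners=True)

    def forward(self, corners, uv):
        """corners: 4 tensors (B,3,H,W) at angular positions (0,0), (1,0),
        (0,1), (1,1); uv: (B,2) target position inside the unit square."""
        b, _, h, w = corners[0].shape
        pos = uv.view(b, 2, 1, 1).expand(-1, -1, h, w)
        feats = torch.cat([self.features(c) for c in corners], dim=1)
        disps = self.disparity(torch.cat([feats, pos], dim=1))  # (B,4,H,W)
        anchors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
        warped = []
        for i, (view, (au, av)) in enumerate(zip(corners, anchors)):
            du = (uv[:, 0] - au).view(b, 1, 1)
            dv = (uv[:, 1] - av).view(b, 1, 1)
            warped.append(self.warp(view, disps[:, i:i + 1], du, dv))
        weights = torch.softmax(
            self.selection(torch.cat(warped + [disps, pos], dim=1)), dim=1)
        # Occlusion-aware merge: each pixel is a convex combination of the
        # four warped views, so occluded contributions can be down-weighted.
        return sum(weights[:, i:i + 1] * warped[i] for i in range(4))
```

Under these assumptions, calling `OcclusionAwareSynthesis()(corners, torch.tensor([[0.5, 0.5]]))` with four `(1, 3, H, W)` corner views would produce the central view; the soft selection weights are what lets the merge favor unoccluded corners near depth discontinuities.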




Acknowledgements

J. Navarro acknowledges support from the Ministerio de Economía y Competitividad of the Spanish Government under Grant TIN2017-85572-P (MINECO/AEI/FEDER, UE).

Author information

Correspondence to J. Navarro.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Navarro, J., Sabater, N. Learning occlusion-aware view synthesis for light fields. Pattern Anal Applic 24, 1319–1334 (2021). https://doi.org/10.1007/s10044-021-00956-2
