Compressively sampled light field reconstruction using orthogonal frequency selection and refinement

https://doi.org/10.1016/j.image.2020.116087

Highlights

  • Light field acquisition devices suffer from a spatio-angular trade-off.

  • Compressive reconstruction is possible from a small number of spatial samples.

  • The 4D Fourier spectrum of a light field is sparse.

  • Sparsity of the light field is better preserved in the continuous Fourier domain.

Abstract

This paper considers the compressive sensing framework as a way of overcoming the spatio-angular trade-off inherent to light field acquisition devices. We present a novel method to reconstruct a full 4D light field from a sparse set of data samples or measurements. The approach relies on the assumption that sparse models in the 4D Fourier domain can efficiently represent light fields. The proposed algorithm reconstructs light fields by selecting the frequencies of the Fourier basis functions that best approximate the available samples in 4D hyper-blocks. The performance of the reconstruction algorithm is further improved by enforcing orthogonality of the approximation residue at each iteration, i.e. for each selected basis function. Since sparsity is better preserved in the continuous Fourier domain, we propose to refine the selected frequencies by searching for neighboring non-integer frequency values. Experiments show that the proposed algorithm yields performance improvements of more than 1 dB compared to state-of-the-art compressive light field reconstruction methods. The frequency refinement step further improves the reconstruction quality of our method by 1.8 dB on average, with a corresponding gain in visual quality.

Introduction

Light field imaging has attracted significant interest over the last two decades, both in research and industry. Providing a rich representation of the captured scene, light fields enable a variety of novel post-capture applications as well as immersive experiences. Several camera setups have been proposed for light field acquisition. Plenoptic cameras use an array of micro-lenses placed in front of the photosensor, which provides additional angular information at the expense of a decreased spatial resolution [1], [2]. Plenoptic consumer cameras, such as Lytro’s, first became widely available; SLR cameras and smartphones then started to feature dual-pixel or quad-pixel sensors. Such light fields do not provide much angular variation because of their limited disparities, but paved the way for more ambitious applications. On the other hand, light fields can also be captured by arrays of cameras [3], usually dedicated to professional applications. Camera arrays offer both higher spatial resolution and wider parallax than plenoptic cameras. This enables accurate depth estimation, which is required for virtual reality applications and cinema production. Note that each new generation of smartphones is equipped with an increasing number of cameras, which foreshadows light field imaging on mobile devices [4]. Another approach to capturing light fields consists in moving a single camera (e.g. the Stanford Gantry) in front of a static scene. That kind of setup excludes the acquisition of video light fields, but provides data with high spatial and angular resolutions. Besides, at the research level, alternative solutions have been proposed for light field acquisition with improved resolution [5], [6], [7], or for more flexible acquisition [8], [9].

While the industry calls for increasing image resolutions, acquiring high-quality 4D light field content remains challenging, due to the complexity and size of optics and photo-sensors and, ultimately, because of the bottleneck of data storage. Indeed, capturing light field videos requires sophisticated systems and engineering proficiency to handle the incoming data stream. When distributed storage is not an option, e.g. for real-time pre-visualization, light field video acquisition setups have no choice but to sacrifice resolution in one or several dimensions: spatial, angular, or temporal. Consider, for example, a grid of 16 4K cameras acquiring videos at a frame rate of 30 fps. This corresponds to about 4 Gigabytes of data per second to write to disk, which exceeds the write throughput of current SSDs (Solid-State Drives).
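The figure above can be checked with a back-of-the-envelope calculation. The per-pixel size is an assumption for the example (roughly one byte per pixel of raw Bayer data per camera); the exact value depends on the sensor and bit depth:

```python
# Illustrative data-rate estimate for the 16-camera 4K grid example.
# Assumption (not from the paper): ~1 byte per pixel of raw Bayer data.
width, height = 3840, 2160     # 4K (UHD) frame
num_cameras = 16
fps = 30
bytes_per_pixel = 1            # raw Bayer mosaic, 8 bits per photosite

rate_bytes = width * height * bytes_per_pixel * num_cameras * fps
print(rate_bytes / 1e9)        # ~3.98, i.e. roughly 4 Gigabytes per second
```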

In this context, we consider a light field acquisition system that stores only a few measurements of the captured light field content. More precisely, the acquisition system consists of a sensor grid (a grid of cameras), and the captured data is transferred to a computer where a software program extracts the data samples to be stored, in order to meet the disk throughput requirements. Each view of the light field is sampled following a random sampling pattern that is independent from one view to another. The sampling operation retains and stores only a sparse set of light field pixels. The sampling mask is assumed to be known by the reconstruction algorithm. Our experimental results show that the proposed random spatial sampling yields better performance than an angular sampling that would keep only a subset of entire views, as e.g. in [10]. In the validation tests, we assume that the light field images have been demosaicked, to make the comparison to state-of-the-art methods possible. Note that the sampling and reconstruction are applied independently to the three color channels (Red, Green and Blue).
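The per-view random sampling described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the array layout, the function name, and the 4% sampling rate are assumptions for the example.

```python
import numpy as np

def sample_light_field(L, rate, seed=0):
    """Hypothetical per-view random sampling sketch.

    L: 4D array with angular axes first (u, v, x, y).
    Each (u, v) view gets its own independent binary mask that retains
    approximately a fraction `rate` of its pixels.
    Returns the masked light field and the mask itself (which the
    reconstruction algorithm is assumed to know).
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros(L.shape, dtype=bool)
    U, V, _, _ = L.shape
    for u in range(U):
        for v in range(V):
            # independent random pattern for each view
            mask[u, v] = rng.random(L.shape[2:]) < rate
    return L * mask, mask

L = np.random.rand(3, 3, 32, 32)          # toy 3x3-view light field
Ls, mask = sample_light_field(L, rate=0.04)
print(mask.mean())                        # close to the 4% sampling rate
```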

This paper describes a light field reconstruction method from a sparse set of randomly selected samples, with no prior sampling pattern specifications. The method, named Orthogonal Frequency Selection (OFS), performs the reconstruction per 4D hyper-block, where a model is iteratively generated to best fit the available data in the hyper-block and its 4D surroundings. The proposed method extends the 2D image Frequency Selective Reconstruction (FSR) approach described in [11] to 4D light fields while further improving it by introducing an orthogonality constraint on the residue. As in our previous work [12], the proposed OFS method exploits the assumption that the light field data is sparse in the Fourier domain [13], meaning that the Fourier transform of the light field can be expressed as a linear combination of a small number of Fourier basis functions. The reconstruction algorithm therefore searches for the basis functions (i.e. their frequencies) that best represent the 4D Fourier spectrum of the sampled light field.
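To convey the idea, here is a simplified 1D analogue of frequency selection with an orthogonal residue, which amounts to orthogonal matching pursuit over a DFT dictionary. The paper's method operates on 4D hyper-blocks with additional machinery, so the function name, dimensions, and parameters here are illustrative only:

```python
import numpy as np

def select_frequencies_1d(samples, mask, n, n_iters):
    """Greedy Fourier frequency selection for a length-n 1D signal.

    samples: observed values at the positions where mask is True.
    At each iteration the DFT basis function best matching the residue is
    selected, then ALL selected coefficients are re-fit by least squares,
    which keeps the residue orthogonal to every selected basis function.
    """
    pos = np.flatnonzero(mask)
    # Fourier dictionary restricted to the sampled positions
    atoms = np.exp(2j * np.pi * np.outer(pos, np.arange(n)) / n) / np.sqrt(n)
    b = samples.astype(complex)
    residue = b.copy()
    selected = []
    for _ in range(n_iters):
        # frequency whose basis function best approximates the residue
        k = int(np.argmax(np.abs(atoms.conj().T @ residue)))
        if k not in selected:
            selected.append(k)
        # least-squares re-fit enforces orthogonality of the residue
        A = atoms[:, selected]
        coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
        residue = b - A @ coeffs
    # synthesize the full-length signal from the selected frequencies
    grid = np.exp(2j * np.pi * np.outer(np.arange(n), selected) / n) / np.sqrt(n)
    return (grid @ coeffs).real

# A signal that is 2-sparse in the Fourier domain, observed at ~50% of positions
n = 64
x = np.cos(2 * np.pi * 5 * np.arange(n) / n)
rng = np.random.default_rng(1)
mask = rng.random(n) < 0.5
rec = select_frequencies_1d(x[mask], mask, n, n_iters=4)
```

The least-squares re-fit after each selection is what distinguishes this from a plain matching pursuit: the residue is re-projected so it stays orthogonal to the span of all selected basis functions.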

Besides, since sparsity is better preserved in the continuous Fourier domain [14], the method is further extended to reconstruct the light field at non-integer angular frequency positions. This method, called OFS+refinement, allows us to better approximate the Fourier spectrum of the signal and to overcome the problem of having a small number of samples in the angular direction. Furthermore, analytical forms can be derived in the Fourier domain, so that the expansion coefficients can be computed directly from the Fourier transforms of the signal. Our solutions can be applied to any captured light field, independently of the acquisition system. They are also free of any prior knowledge of scene geometry and do not require any pre-processing step such as depth estimation.
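The following sketch illustrates why a non-integer frequency search helps (it is not the paper's exact procedure): around a selected integer bin, the sampled data is correlated with continuous-frequency exponentials on a fine local grid, and the frequency with the strongest response is kept. The function name and grid parameters are assumptions for the example.

```python
import numpy as np

def refine_frequency(samples, pos, n, k, half_width=0.5, steps=21):
    """Local search for a non-integer frequency near integer bin k.

    Evaluates the correlation of the sampled data with complex
    exponentials at fractional frequencies in [k - 0.5, k + 0.5]
    and returns the frequency with the largest response.
    """
    best_f, best_c = float(k), -1.0
    for f in np.linspace(k - half_width, k + half_width, steps):
        atom = np.exp(2j * np.pi * pos * f / n)
        c = abs(np.vdot(atom, samples))
        if c > best_c:
            best_f, best_c = f, c
    return best_f

# A tone at the non-integer frequency 5.3 falls between integer DFT bins;
# the local search recovers a value close to 5.3.
n = 128
pos = np.arange(n)
samples = np.exp(2j * np.pi * pos * 5.3 / n)
print(refine_frequency(samples, pos, n, k=5))   # ~5.3
```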

Experimental results with several light fields show that the proposed OFS and OFS+refinement methods yield a high reconstruction quality from a small number of input samples (e.g. at sampling rates as low as 4%). Results with synthetic and real light fields show that they outperform reference methods that either similarly pose the problem in a compressive sensing framework, as in [15] where the authors use a union of trained dictionaries, or exploit sparsity in the 4D Fourier domain, as in [14]. Given that storing a subset of views rather than randomly sampling all the views could also address the data rate issue, we also compare the proposed OFS algorithm with the view synthesis method of [10], which uses deep neural networks.

This paper is organized as follows. Section 2 gives an overview of the related work. Section 3 gives a detailed description of the proposed reconstruction method for 4D light fields. The non-integer frequency refinement is also described in this section. The experimental results with the proposed algorithms are given in Section 4. Section 5 concludes the whole paper.

Section snippets

Related work

In this section, we present an overview of the topics related to our work. We first give a brief review of state-of-the-art light field imaging solutions. Then, we present existing compressive sensing systems for higher-resolution light field acquisition. We finally introduce sparse light field reconstruction methods in the Fourier domain.

Problem statement

Let L(x,y,u,v) denote a given 4D light field, which we assume to be sparse in the 4D Fourier domain. Hence, the light field L can be represented by a sparse vector α as

    L = Ψα,

where Ψ is the matrix containing the Fourier basis functions and α is the sparse vector of expansion coefficients. Let Ls(x,y,u,v) be the 4D randomly-sampled version of L, obtained as

    Ls = ΦL = ΦΨα,

where Φ is the sub-sampling matrix, containing 0 and 1 values that define the available data sample positions.
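A tiny 1D analogue of this notation may help fix the objects involved (illustrative only; the paper works with 4D hyper-blocks):

```python
import numpy as np

# 1D analogue of the notation above: Psi holds Fourier basis functions as
# columns, alpha is a sparse coefficient vector, and Phi is a 0/1 matrix
# selecting the stored sample positions.
n = 8
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
Psi = np.exp(2j * np.pi * j * k / n) / np.sqrt(n)   # orthonormal Fourier basis
alpha = np.zeros(n, dtype=complex)
alpha[2] = 1.0                                      # sparse: one active frequency
L = Psi @ alpha                                     # L = Psi alpha
keep = [0, 2, 3, 6]                                 # retained sample positions
Phi = np.eye(n)[keep]                               # sub-sampling matrix of 0s and 1s
Ls = Phi @ L                                        # Ls = Phi L = Phi Psi alpha
```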

Experiments

The proposed OFS and OFS+refinement (with non-integer frequencies) algorithms, as well as FSR [11] extended here to 4D, are compared against three different types of methods. We compare the proposed approach with:

  • (1) the compressive sensing method based on a union of trained dictionaries of [15], with overlapping and non-overlapping patches;

  • (2) the light field reconstruction method from a set of views, exploiting sparsity of light fields in the 4D Fourier domain [14];

  • (3) the deep learning-based view synthesis

Conclusion

In this paper, we introduced a new iterative block-wise algorithm to compressively reconstruct light field images. We tackle the challenge of capturing high-resolution images of the scene, by storing compressive data and reconstructing the full resolution images using an approximation of the available samples in the Fourier domain. The approximation model is generated by sparsely selecting the Fourier basis functions that best fit the sampled data, while ensuring the orthogonality of the

CRediT authorship contribution statement

Fatma Hawary: Conceptualization, Methodology, Validation, Formal analysis, Writing - original draft, Writing - review & editing, Software, Investigation, Visualization. Guillaume Boisson: Conceptualization, Methodology, Validation, Investigation, Resources, Writing - review & editing, Supervision, Funding acquisition. Christine Guillemot: Conceptualization, Resources, Writing - review & editing, Supervision, Funding acquisition. Philippe Guillotel: Project administration, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (41)

  • Shi, L., et al., Light field reconstruction using sparsity in the continuous Fourier domain, ACM Trans. Graph. (2014)

  • Ng, R., Digital Light Field Photography (2006)

  • Georgiev, T., et al., Super-resolution with the focused plenoptic camera, Proc. SPIE (2011)

  • Wilburn, B., et al., High performance imaging using large camera arrays, ACM Trans. Graph. (TOG) (2005)

  • Venkataraman, K., et al., PiCam: An ultra-thin high performance monolithic camera array, ACM Trans. Graph. (Proc. SIGGRAPH Asia) (2014)

  • Veeraraghavan, A., et al., Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing, ACM Trans. Graph. (2007)

  • Hirsch, M., et al., A switchable light field camera architecture with angle sensitive pixels and dictionary-based sparse coding, ICCP (2014)

  • Iseringhausen, J., et al., 4D imaging through spray-on optics, ACM Trans. Graph. (2017)

  • Buehler, C., et al., Unstructured lumigraph rendering, ACM SIGGRAPH (2001)

  • Davis, A., et al., Unstructured light fields

  • Kalantari, N.K., et al., Learning-based view synthesis for light field cameras, ACM Trans. Graph. (2016)

  • Seiler, J., et al., Resampling images to a regular grid from a non-regular subset of pixel positions using frequency selective reconstruction, IEEE Trans. Image Process. (2015)

  • Hawary, F., et al., Compressive 4D light field reconstruction using orthogonal frequency selection, IEEE ICIP (2018)

  • Hawary, F., et al., Scalable light field compression scheme using sparse reconstruction and restoration, IEEE ICIP (2017)

  • Miandji, E., et al., Compressive image reconstruction in reduced union of subspaces, Comput. Graph. Forum (2015)

  • Liang, C.-K., et al., Programmable aperture photography: multiplexed light field acquisition, Proc. ACM SIGGRAPH (2008)

  • Xu, Z., et al., A high-resolution light field camera with dual-mask design, Proc. SPIE (2012)

  • Zhang, Z., et al., Light field from micro-baseline image pair, CVPR (2015)

  • Yagi, Y., et al., PCA-coded aperture for light field photography, IEEE ICIP (2017)

  • Yagi, Y., et al., Designing coded aperture camera based on PCA and NMF for light field acquisition, IEICE Trans. Inf. Syst. (2018)
Acknowledgment

This work has been in part supported by the EU H2020 Research and Innovation Programme under grant agreement No 694122 (ERC advanced grant CLIM).