Single image super-resolution via hybrid resolution NSST prediction

doi:10.1016/j.cviu.2021.103202

Computer Vision and Image Understanding

Volume 207, June 2021, 103202

Highlights

• This is the first attempt to solve image SR as an NSST coefficient prediction problem.
• A new deep hybrid resolution network is proposed in a residual-in-residual style.
• The proposed NSST-based framework can be easily adapted to arbitrary CNN architectures.
• Our method achieves more appealing results than previous SISR methods.
• We show that our proposed solutions are effective for real-world image SR.

Abstract

Convolutional neural networks (CNNs) have achieved great success in single image super-resolution (SR). However, most previous methods predict high-resolution (HR) images in the spatial domain, producing over-smoothed outputs that lose texture details. To address this problem, we propose to predict nonsubsampled shearlet transform (NSST) coefficients, which better represent the global topology and local texture details of HR images. In addition, we propose a deep hybrid resolution network in a residual-in-residual style, which aggregates features at multiple resolutions to gather rich context information in compact representations. When evaluated on the newly released RealSR dataset and on traditional simulated datasets, our method, named hybrid resolution NSST prediction (HRNP), achieves better results in terms of PSNR and SSIM than state-of-the-art methods. Moreover, we find that HRNP preserves complex edges and curves better than other methods.

Introduction

In recent years, the restoration of high-resolution (HR) images from given low-resolution (LR) images has been widely investigated in the multimedia community. This task is known as super-resolution (SR). Because image SR restores high-frequency information, it is used in many applications, ranging from medical and satellite imaging to visual surveillance, where high-frequency details are highly desirable. In this paper, we tackle the specific problem of single image super-resolution (SISR).

Recent advances in SISR take advantage of the powerful representation ability of convolutional neural networks (CNNs). However, obtaining training data for CNNs is challenging because LR–HR image pairs are difficult to acquire in real-world applications. Therefore, most existing methods train and evaluate their models on simulated datasets, where the LR images are artificially generated by applying bicubic downsampling to their HR counterparts. Unfortunately, models trained on such simulated datasets generalize poorly to practical applications, since real-world degradation functions are much more complex. To approximate realistic degradation more accurately, the RealSR dataset (Cai et al., 2019) was released, in which the LR images are generated using digital single-lens reflex (DSLR) cameras to reflect realistic conditions. In this paper, we adopt the RealSR dataset for training and evaluation, so that our model can handle real-world image SR in practical applications.
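The gap between simulated and real data can be stated with a degradation model that is common in the SR literature. The notation below is an assumption made for illustration and is not taken from this paper: I^{HR} and I^{LR} are the HR and LR images, k is a blur kernel, the down arrow denotes subsampling by a factor s, and n is additive noise.

```latex
I^{LR} = \left( I^{HR} \otimes k \right) \downarrow_{s} + \; n
```

In the simulated setting the kernel k is fixed and known (bicubic) and n is zero, whereas for LR–HR pairs produced by real cameras neither k nor n is known in advance.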

On the other hand, most current SISR methods (Dong et al., 2016b, Tai et al., 2017b, Zhang et al., 2018, He et al., 2019) use a pixel-wise loss in image space, forcing the outputs to become increasingly similar to the ground-truth HR images over the iterative CNN training procedure. However, such approaches usually produce blurry and over-smoothed outputs, losing texture details. In contrast, some efforts have been made to solve the SISR problem in the transform domain, which can retain the contextual and textural information of an image at different levels. Instead of learning the mapping from LR to HR images in the spatial domain, these methods formulate the SR problem as transform coefficient prediction. For instance, the discrete wavelet transform (DWT) has been explored for SR in both conventional frameworks (Demirel and Anbarjafari, 2011b, Demirel and Anbarjafari, 2011a, Hui and Lam, 2012) and deep networks (Kumar et al., 2017, Guo et al., 2017, Liu et al., 2018, Zhong et al., 2018, Huang et al., 2019, Deng et al., 2019). DWT effectively handles the “point singularity” problem of one-dimensional signals. However, a common limitation of DWT is that it cannot represent the curves and edges of two-dimensional images well because of its isotropic property (Do and Vetterli, 2005). To overcome this limitation, we propose to use the nonsubsampled shearlet transform (NSST), which consists of the non-subsampled Laplacian pyramid transform and several different shearing filters (Li et al., 2015). NSST provides an optimal approximation for piecewise smooth functions and is also fast to implement. In Fig. 1, we compare the high-frequency coefficients of NSST and DWT, where we can clearly see that NSST represents the curvature more accurately. In this paper, we propose to formulate the SISR problem as the prediction of NSST coefficients, which preserve richer structural information than DWT coefficients and avoid artifacts.
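To make "transform coefficient prediction" concrete, the following is a minimal sketch of a one-level 2-D Haar DWT and its inverse in plain Python. It is a stand-in chosen for brevity: the paper uses NSST, which is nonsubsampled and directional, and is not implemented here. The function names `haar_dwt2` and `haar_idwt2` and the subband names are illustrative assumptions.

```python
def haar_dwt2(image):
    """One-level 2-D Haar transform of a grayscale image (nested lists).

    Height and width must be even. Returns four half-size subbands:
    approximation, column difference, row difference, diagonal difference.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    approx, d_col, d_row, d_diag = [], [], [], []
    for i in range(0, height, 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for j in range(0, width, 2):
            a, b = image[i][j], image[i][j + 1]          # top-left, top-right
            c, d = image[i + 1][j], image[i + 1][j + 1]  # bottom-left, bottom-right
            row_a.append((a + b + c + d) / 2.0)  # scaled local average
            row_c.append((a - b + c - d) / 2.0)  # left column minus right column
            row_r.append((a + b - c - d) / 2.0)  # top row minus bottom row
            row_d.append((a - b - c + d) / 2.0)  # diagonal difference
        approx.append(row_a)
        d_col.append(row_c)
        d_row.append(row_r)
        d_diag.append(row_d)
    return approx, d_col, d_row, d_diag


def haar_idwt2(approx, d_col, d_row, d_diag):
    """Inverse of haar_dwt2: rebuild the full-size image from four subbands."""
    image = []
    for r in range(len(approx)):
        top, bottom = [], []
        for s, x, y, z in zip(approx[r], d_col[r], d_row[r], d_diag[r]):
            top += [(s + x + y + z) / 2.0, (s - x + y - z) / 2.0]
            bottom += [(s + x - y - z) / 2.0, (s - x - y + z) / 2.0]
        image += [top, bottom]
    return image


# Round trip on a tiny image: the subbands carry the same information as the pixels.
hr = [[1, 2], [3, 4]]
restored = haar_idwt2(*haar_dwt2(hr))
```

In a transform-domain SR pipeline, a network would predict the subbands of the HR image and the inverse transform would map them back to pixels. The three difference subbands of this sketch respond only to horizontal, vertical and diagonal changes, which is a simple view of the limited directionality that the text attributes to DWT.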
The contributions are summarized as follows:

(1) We propose a deep hybrid resolution network (HyRNet) in a residual-in-residual style, which fully exploits multi-resolution features to gather rich context information in compact representations, allowing our model to generate more plausible SR results.

(2) A novel NSST- and CNN-based approach, namely Hybrid Resolution NSST Prediction (HRNP), is proposed for image SR. To the best of our knowledge, this is the first attempt to solve image SR as an NSST coefficient prediction problem. In principle, our NSST prediction can be easily adapted to arbitrary CNN architectures.

(3) We show that our proposed solutions are effective for real-world image SR via model analysis. When evaluated on the simulated and RealSR datasets, our method achieves more appealing results both quantitatively and qualitatively than previous state-of-the-art methods.

The rest of the paper is organized as follows: Section 2 briefly reviews related work on CNN-based and transform-domain-based SISR methods. Section 3 describes the proposed method in detail. Model analysis and comparisons with state-of-the-art methods are presented in Section 4. Finally, concluding remarks are made in Section 5.

Related work

In this paper, we propose a CNN- and NSST-based approach for SISR. Therefore, in the following, we review related work on CNN-based methods and transform-domain-based methods, respectively.

Method

In this section, we will first give an overview of our proposed method for SISR. After that, we will provide a brief introduction to our proposed HyRNet, followed by the description of NSST.

Experiments

In this section, we evaluate the performance of our method for image SR. We first provide the implementation details about datasets, training settings, and evaluation protocol. After outlining our experimental setup, we study the contribution of NSST and our HyRNet through ablation experiments. Finally, we compare our method with several state-of-the-art methods on the RealSR and simulated datasets.
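The abstract reports results in terms of PSNR and SSIM. As a reminder of what the first of these measures, here is a minimal sketch of PSNR for grayscale images stored as nested lists. The function name `psnr` and the `peak` parameter are illustrative assumptions, and the paper's exact evaluation protocol (for example, which colour channel is used) is not given in this excerpt.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are nested lists of pixel values; `peak` is the largest
    possible pixel value (255 for 8-bit images).
    """
    same_rows = len(reference) == len(estimate)
    if not same_rows or any(len(r) != len(e) for r, e in zip(reference, estimate)):
        raise ValueError("images must have the same shape")
    count = sum(len(row) for row in reference)
    if count == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    squared_error = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 example: one pixel of the estimate is off by 10 grey levels.
hr = [[0, 255], [255, 0]]
sr = [[0, 255], [255, 10]]
score = psnr(hr, sr)
```

A higher PSNR means a smaller mean squared error with respect to the ground-truth HR image. Because PSNR is a purely pixel-wise measure, it is usually reported together with a structural measure such as SSIM.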

Conclusions

In this paper, we propose a new approach to the single image super-resolution task. To represent curvatures more accurately, we propose to predict the NSST coefficients of high-resolution images instead of learning the spatial mapping. We also propose a new deep network in a residual-in-residual style to further improve performance. Experiments on the RealSR dataset and simulated datasets demonstrate that our method outperforms state-of-the-art models for realistic SR. The

CRediT authorship contribution statement

Yunan Liu: Conceptualization, Methodology, Writing - original draft. Shanshan Zhang: Supervision, Resources, Project administration. Chunpeng Wang: Investigation, Writing - review & editing. Jie Xu: Software, Investigation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (Grant No. 61861136011), the Natural Science Foundation of Jiangsu Province, China (Grant No. BK20181299), the National Science Fund of China (Grant Nos. 61702262, 61802212), the Fundamental Research Funds for the Central Universities, China (Grant No. 30920032201), the National Key Research and Development Program of China under Grant 2017YFC0820601, and Shandong

References (50)

Easley, G., et al., 2008. Sparse directional image representation using the discrete shearlet transform. Appl. Comput. Harmon. Anal.
Hui, Z., et al., 2012. Eigentransformation-based face super-resolution in the wavelet domain. Pattern Recognit. Lett.
Izadpanahi, S., et al., 2013. Motion based video super resolution using edge directed interpolation and complex wavelet transform. Signal Process.
Kumar, N., et al., 2017. Convolutional neural networks for wavelet domain super resolution. Pattern Recognit. Lett.
Li, Y., et al., 2015. No-reference image quality assessment with shearlet transform and deep neural networks. Neurocomputing.
Wang, X.Y., et al., 2016. Blind optimum detector for robust image watermarking in nonsubsampled shearlet domain. Inf. Sci.
Cai, J., Zeng, H., Yong, H., Cao, Z., Zhang, L., 2019. Toward real-world single image super-resolution: a new benchmark...
Chen, L., et al., 2017. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
Dai, T., Cai, J., Zhang, Y., Tao, S., Zhang, L., 2019. Second-order attention network for single image...
Demirel, H., et al., 2011. Discrete wavelet transform-based satellite image resolution enhancement. IEEE Trans. Geosci. Remote Sens.
Demirel, H., et al., 2011. Image resolution enhancement by using discrete and stationary wavelet decomposition. IEEE Trans. Image Process.
Deng, X., Yang, R., Xu, M., Dragotti, P.L., 2019. Wavelet domain style transfer for an effective perception-distortion...
Do, M.N., et al., 2005. The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans. Image Process.
Dong, C., Loy, C.C., He, K., Tang, X.O., 2014. Learning a deep convolutional network for image super-resolution. In:...
Dong, C., Loy, C.C., He, K., Tang, X.O., 2016a. Accelerating the super-resolution convolutional neural network. In:...
Dong, C., et al., 2016. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell.
Guo, T., Mousavi, H.S., Vu, T.H., Monga, V., 2017. Deep wavelet prediction for image super-resolution. In: IEEE Conf....
Haris, M., Shakhnarovich, G., Ukita, N., 2018. Deep back-projection networks for super-resolution. In: IEEE Conf....
He, X., Mo, Z., Wang, P., Liu, Y., Yang, M., Cheng, J., 2019. ODE-inspired network design for single image...
Huang, H.B., et al., 2019. Wavelet domain generative adversarial network for multi-scale face hallucination. Int. J. Comput. Vis.
Huang, J.B., Singh, A., Ahuja, N., 2015. Single image super-resolution from transformed self-exemplars. In: IEEE Conf....
Hui, Z., Wang, X.M., Gao, X.B., 2018. Fast and accurate single image super-resolution via information distillation...
Ji, H., et al., 2008. Robust wavelet-based super-resolution reconstruction. IEEE Trans. Pattern Anal. Mach. Intell.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T., 2014. Caffe:...
Kim, J., Lee, J.K., Lee, K.M., 2016a. Accurate image super-resolution using very deep convolutional networks. In: IEEE...
