1 Introduction

Image denoising, as a low-level image processing operator, is an important front-end procedure for high-level visual tasks such as object recognition, digital entertainment, and remote sensing imaging. Noise is a random variation of image intensity and appears as grain in the image. It may arise from basic physics, such as the photon nature of light or the thermal energy inside the image sensors, and it means that pixels show intensity values that differ from the true pixel values. Digital images may be contaminated during acquisition, transmission, and compression [1] by diverse types of noise [2] generated by different causes: signal instabilities, defective sensors, physical deterioration of the material due to aging, poor lighting conditions, transmission errors due to channel noise, or interference caused by electromagnetic fields. Noise suppression is of great interest in digital image processing, considering that the quality improvement of corrupted images is of essential importance for the majority of image processing areas, including image analysis, edge detection, and pattern recognition. Mathematically, the problem of image denoising can be modeled as follows:

$$\begin{array}{@{}rcl@{}} y=x+b \end{array} $$
(1)

where y is the observed noisy image, x is the original image, and b represents additive zero-mean white homogeneous Gaussian noise with standard deviation σ. The goal of image denoising is to recover the original image from its noisy observation. The main challenge is to remove the noise while retaining the important signal features as much as possible. PSNR is a common metric used to assess the performance of denoising methods: the higher the PSNR, the more accurate the denoising.
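As a concrete illustration of the observation model (1) and the PSNR criterion, the following sketch (Python with NumPy is assumed here purely for illustration) synthesizes a noisy observation and measures its PSNR against the clean image:

```python
import numpy as np

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio (dB) between reference x and estimate y."""
    mse = np.mean((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
x = np.full((64, 64), 128.0)        # toy "clean" image
b = rng.normal(0.0, 25.0, x.shape)  # zero-mean Gaussian noise, sigma = 25
y = x + b                           # observation model: y = x + b
# The noisy observation sits near 10*log10(255^2 / sigma^2), about 20 dB for
# sigma = 25; any denoiser should push the PSNR of its estimate above this.
noisy_psnr = psnr(x, y)
```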

Since recovering the clean image x from (1) is an ill-posed problem, a unique solution cannot be obtained from the noisy image model. To obtain a good estimate \(\tilde {x}\), image denoising has been studied extensively in the field of image processing over the past several years. Generally, image denoising methods can be classified into spatial domain methods and transform domain methods [3], which are introduced in more detail in [4].

The wavelet transform (WT) [5] has proved to be effective in noise removal. It decomposes the input signal into multiple scales, which represent different space-frequency components of the original signal. At each scale, operations such as thresholding [6–11] and statistical modeling [12–14] can be performed to suppress noise. Denoising is accomplished by transforming the processed wavelet coefficients back into the spatial domain. These wavelet-based denoising techniques can also be viewed as applying fixed-basis dictionaries [15–22] to whole images. Such methods have demonstrated their efficiency and have achieved state-of-the-art PSNR performance. However, in the denoising process, these methods rely on a thresholding technique, typically one of the two most popular thresholding functions: the soft-thresholding function or the hard-thresholding function. Consequently, a small threshold retains noisy wavelet coefficients, so the resultant image may still be noisy, whereas a large threshold sets a greater number of wavelet coefficients to zero, which over-smooths the image and may cause blur and artifacts. Therefore, in some previous threshold-based wavelet denoising methods, even though the quantitative results are promising, the artifacts in the denoised images are quite noticeable.

In this paper, we address these issues by proposing a new wavelet denoising approach based on an unsupervised learning model. In the proposed method, the approximation and wavelet coefficients are denoised using an adaptive dictionary learned over the set of patches extracted from the wavelet representation of the corrupted image, instead of using a thresholding operator. In parallel, the sparse and redundant representation model has been applied to image denoising and has drawn much attention from researchers. In [23], the K-SVD algorithm was proposed for learning an adaptive dictionary for sparse representation of gray-scale image patches. Inspired by the idea of prior learning on the corrupted image, the K-SVD algorithm was used to remove white Gaussian noise [24, 25]. This patch-based sparse representation algorithm [26] has been shown to excel both at providing sparse representations and at denoising. Motivated by these ideas and by the merits of the wavelet transform (sparsity, multi-resolution structure, and similarity with the human visual system), we propose a new image denoising method that takes advantage of both the wavelet transform and the sparse coding framework.

In our proposed scheme, the prior learning on the corrupted image is transferred to its wavelet coefficients, and the adaptive dictionary is learned in the wavelet domain; that is, image denoising is performed in the framework of sparse representations in the wavelet domain. Moreover, the proposed model only uses the wavelet decomposition of the noisy image as a training set in the dictionary learning phase; it does not involve any target output. The model has to learn on its own, determining and adapting to the structural characteristics of the input patterns. As a result, the proposed work produces a wavelet denoising approach based on an unsupervised learning model, using K-SVD as the dictionary learning algorithm and making no a priori assumption about the data except a parsimony principle.

Experimental results show that the proposed method provides impressive denoising results compared with the wavelet thresholding approach for image denoising, the K-SVD denoising method, and the total variation (TV) denoising method based on the split Bregman technique [27], one of the best denoising models. It was also found in the experiments that the proposed method performs favorably against the block-matching and 3D filtering (BM3D) method [28], which is often regarded as a state-of-the-art denoising algorithm.

The rest of the paper is organized as follows. Section 2 briefly reviews the wavelet transform, presents the advantages of wavelet theory, introduces related work, and then presents and explains the proposed approach. Section 3 describes the experimental protocol and discusses the obtained results. Finally, Section 4 concludes the paper.

2 Methods

2.1 Wavelets and image processing

2.1.1 Wavelet

A wavelet is a function that oscillates like a wave but is quickly attenuated. Formally, a wavelet is a function ψ of L2(R) that satisfies the following admissibility condition:

$$\begin{array}{@{}rcl@{}} C_{\psi}=\int_{-\infty}^{+\infty} \frac{| \hat{\psi}(\xi)|^{2}}{\left| \xi\right| }d\xi < +\infty \end{array} $$
(2)

2.1.2 Discrete wavelet transform

The discrete wavelet transform is based on the concept of multi-resolution analysis (MRA) introduced by Mallat [29]. The discrete wavelet transform (DWT) of image signals produces a non-redundant image representation, which provides better spatial and spectral localization of image information compared with other multi-scale representations such as the Gaussian and Laplacian pyramids. Recently, the DWT has attracted more and more interest in image denoising. The DWT can be interpreted as signal decomposition into a set of independent, spatially oriented frequency channels. The signal S is passed through two complementary filters, producing two signals: approximation and details. This is called decomposition or analysis. The components can be assembled back into the original signal without loss of information; this process is called reconstruction or synthesis. The mathematical manipulations involved in analysis and synthesis are called the discrete wavelet transform and the inverse discrete wavelet transform [30].

2.1.2.1 1D discrete wavelet transform

Every analog signal x(t) with finite energy can be decomposed into a sum of shifted and dilated wavelet functions ψ(t) and shifted scale functions ϕ(t):

$$\begin{array}{@{}rcl@{}} x(t)=\sum_{k=-\infty}^{\infty}c(k)\phi(t-k)+\sum_{j=0}^{\infty}\sum_{k=-\infty}^{\infty}d(j,k)2^{\frac{j}{2}}\psi(2^{j}t-k) \end{array} $$
(3)

where c(k) are the scale coefficients and d(j,k) are the wavelet coefficients. This is a dyadic variant of the DWT. Scale and wavelet coefficients are calculated using scalar products:

$$\begin{array}{@{}rcl@{}} c(k)=\int_{-\infty}^{+\infty}x(t)\phi(t-k)dt \end{array} $$
(4)
$$\begin{array}{@{}rcl@{}} d(j,k)=2^{\frac{j}{2}}\int_{-\infty}^{+\infty}x(t)\psi(2^{j}t-k)dt \end{array} $$
(5)

Hence, filter banks with perfect reconstruction property can be used as a simple realization of the DWT using low-pass and high-pass filters associated, respectively, to the scale function, and the wavelet function [5].
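A one-level realization of this filter-bank view can be sketched in NumPy, using the Haar pair as an illustrative choice of low-pass (scale) and high-pass (wavelet) filters:

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)  # Haar filter coefficient (illustrative choice)

def haar_dwt1(x):
    """One-level 1D Haar DWT: low-pass and high-pass filtering + downsampling."""
    x = np.asarray(x, dtype=float)
    c = H * (x[0::2] + x[1::2])   # approximation (scale) coefficients
    d = H * (x[0::2] - x[1::2])   # detail (wavelet) coefficients
    return c, d

def haar_idwt1(c, d):
    """Synthesis filter bank: upsample and recombine the two channels."""
    x = np.empty(2 * len(c))
    x[0::2] = H * (c + d)
    x[1::2] = H * (c - d)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
c, d = haar_dwt1(x)
# The filter bank has the perfect reconstruction property: analysis followed
# by synthesis returns the input exactly (and preserves its energy).
```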

2.1.2.2 2D discrete wavelet transform

The separable 2D discrete wavelet transform is the simplest form of the two-dimensional generalization of the wavelet transform. It consists of a standard 1D DWT applied first to each row and then to each column, as shown in Fig. 1.

Fig. 1
figure 1

Block diagram of wavelet transform

In Fig. 1, if an image has N1 rows and N2 columns, the decomposition results in four quarter-size images (N1/2×N2/2): the details (LH, HL, HH) and the approximation LL. The approximation LL is the product of the two low-pass filters and serves as the input to the next decomposition level. The reconstruction is performed in the opposite way: first on columns, then on rows. The separable 2D DWT has three wavelet functions (m and n are coordinates of the input image):

$$\begin{array}{@{}rcl@{}} \psi^{1}(m,n)=\phi(m)\psi(n) \hspace{0.25cm} \text{(LH wavelet)}, \end{array} $$
(6)
$$\begin{array}{@{}rcl@{}} \psi^{2}(m,n)=\psi(m)\phi(n) \hspace{0.25cm} \text{(HL wavelet)}, \end{array} $$
(7)
$$\begin{array}{@{}rcl@{}} \psi^{3}(m,n)=\psi(m)\psi(n) \hspace{0.25cm} \text{(HH wavelet)}, \end{array} $$
(8)

and one scale function:

$$\begin{array}{@{}rcl@{}} \phi(m,n)=\phi(m)\phi(n) \end{array} $$
(9)

associated to the approximation LL [31].

An N level decomposition can be performed resulting in 3N+1 different frequency bands: LL is low frequency or approximation coefficients, and the wavelet image coefficients LH, HL, and HH which correspond, respectively, to vertical high frequencies (horizontal edges), horizontal high frequencies (vertical edges), and high frequencies in both directions (corners), as shown in Fig. 2. In Fig. 2, the number written next to the sub-band name shows the level. The next level of wavelet transform is applied to the low-frequency sub-band image LL only. For more details about the 2D discrete wavelet transform and 2D inverse discrete wavelet transform, the reader can refer to [5] and [3234].
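The separable row-then-column scheme and the resulting quarter-size sub-bands can be sketched with the Haar filters (an illustrative choice; any orthogonal wavelet behaves the same way; which detail band is labeled LH vs. HL is convention-dependent):

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)  # Haar filter coefficient

def split(a):
    """One-level 1D Haar analysis along the last axis."""
    return H * (a[..., 0::2] + a[..., 1::2]), H * (a[..., 0::2] - a[..., 1::2])

def haar_dwt2(img):
    """Separable 2D DWT sketch: filter the rows, then the columns of each band."""
    lo, hi = split(img)                  # along rows
    LL, LH = (b.T for b in split(lo.T))  # along columns of the low band
    HL, HH = (b.T for b in split(hi.T))  # along columns of the high band
    return LL, LH, HL, HH

img = np.arange(64, dtype=float).reshape(8, 8)
LL, LH, HL, HH = haar_dwt2(img)
# Four quarter-size sub-bands (N1/2 x N2/2); the orthonormal Haar filters
# also preserve the total energy of the image across the sub-bands.
```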

Fig. 2
figure 2

Sub-bands after two levels of wavelet decomposition

2.2 Advantages of wavelet theory

One of the main advantages of wavelets is that they allow complex information such as images to be decomposed into elementary forms at different positions and scales and subsequently reconstructed with high precision [5].

The second main advantage of wavelets is computational efficiency: the fast wavelet transform can be implemented with filter banks [5].

The wavelet transform provides a sparse representation for a large class of signals [5], and it is capable of revealing aspects of data that other signal analysis techniques miss, such as trends, breakdown points, discontinuities in higher derivatives, and self-similarity [35].

Wavelets also have the great advantage of capturing the energy of a signal in a few transform values. The transform does not change the number of pixels required to represent the image, and it separates the information in a way that resembles the human visual system [36].

2.3 Related work

2.3.1 The K-SVD denoising method

2.3.1.1 Sparse and redundant representations model of signal

Given a signal y∈Rn and an over-complete dictionary D=[d1,d2,⋯,dk]∈Rn×k (n≪k), where each column di is called an atom, the sparse and redundant representation model of the signal is equivalent to the following problem:

$$\begin{array}{@{}rcl@{}} (P_{0}): \hspace{0.05cm} \min_{\alpha}\parallel\alpha\parallel_{0} \hspace{0.1cm} s.t.\hspace{0.1cm} y=D\alpha \end{array} $$
(10)

where ∥α∥0, called the l0 norm, measures the sparsity of the vector α. The problem (P0) can be solved by the orthogonal matching pursuit (OMP) algorithm [37], chosen for its simplicity and efficiency.
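A minimal OMP sketch (an illustrative implementation, not the authors' code): at each step, pick the atom most correlated with the residual, then refit all selected atoms by least squares.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit sketch. D must have unit-norm columns;
    k is the target sparsity of the representation."""
    support, alpha = [], np.zeros(D.shape[1])
    residual = y.astype(float)
    for _ in range(k):
        # Greedy atom selection: largest correlation with the residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha[support] = coef
    return alpha

# A tiny dictionary: the standard basis of R^4 plus one extra unit atom.
D = np.hstack([np.eye(4), np.full((4, 1), 0.5)])
y = np.array([3.0, 0.0, 0.0, 1.0])   # = 3*d0 + 1*d3, a 2-sparse signal
alpha = omp(D, y, k=2)
# OMP recovers the 2-sparse representation exactly here: D @ alpha == y.
```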

2.3.1.2 Sparse and redundant representations model for image denoising

We address the classical image denoising problem: a clean image x is corrupted by additive zero-mean white homogeneous Gaussian noise z, with standard deviation σ and ∥z∥2≤ε, and the observed noisy image y is generated. Hence,

$$\begin{array}{@{}rcl@{}} y=x+z \end{array} $$
(11)

Assuming x∈Rn has a sparse representation over a redundant dictionary, we modify (P0) to obtain the following denoising model:

$$\begin{array}{@{}rcl@{}} (P_{0,\delta}): \hspace{0.05cm} \hat{\alpha}=\arg\min_{\alpha}\parallel\alpha\parallel_{0} \hspace{0.1cm} s.t.\hspace{0.1cm} \parallel y-D\alpha \parallel_{2}^{2}\leq\delta \end{array} $$
(12)

Here δ=δ(ε), and the denoised image is \(\hat {x}=D\hat {\alpha }\) [38, 39].

2.3.1.3 The K-SVD image denoising method

The key point in solving problem (P0,δ) is to find a suitable redundant dictionary; the K-SVD algorithm [23] was proposed for this purpose. The basic idea is that, given a training signal and an initial dictionary, prior learning is used to obtain a dictionary that yields sparse representations of the training signal. The algorithm alternates between a sparse coding step based on the OMP method and a dictionary updating step based on a simple singular value decomposition (SVD). The reader can refer to [23, 24] for details.
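The dictionary-update stage can be sketched for a single atom, following the idea of [23] (variable names here are hypothetical): restrict attention to the signals that use atom j, form the residual with that atom's contribution removed, and replace the atom and its coefficient row by the leading rank-1 SVD factors.

```python
import numpy as np

def ksvd_atom_update(D, A, Y, j):
    """One K-SVD dictionary-update step (sketch): refit atom j and its
    coefficient row via a rank-1 SVD of the residual restricted to the
    training signals that actually use atom j."""
    users = np.nonzero(A[j])[0]
    if users.size == 0:
        return D, A
    # Residual of the selected signals with atom j's contribution removed.
    E = Y[:, users] - D @ A[:, users] + np.outer(D[:, j], A[j, users])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]                 # new unit-norm atom
    A[j, users] = s[0] * Vt[0]        # matching coefficients
    return D, A

rng = np.random.default_rng(0)
Y = rng.normal(size=(8, 12))                                # training signals
D = rng.normal(size=(8, 4)); D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
A = rng.normal(size=(4, 12)) * (rng.random((4, 12)) < 0.5)   # sparse codes
err_before = np.linalg.norm(Y - D @ A)
D, A = ksvd_atom_update(D, A, Y, 1)
err_after = np.linalg.norm(Y - D @ A)  # rank-1 refit: never worse than before
```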

2.3.2 Wavelet-based image denoising method

Wavelet-based methods have proved to be effective in noise removal. These methods are mainly based on thresholding the discrete wavelet transform coefficients, which have been affected by additive white Gaussian noise [6]. As shown in Fig. 3, the basic denoising algorithms that use DWT consist of three steps:

  • The discrete wavelet transform is adopted to decompose the noisy image and get the wavelet coefficients.

    Fig. 3
    figure 3

    The basic framework of the wavelet transform based image denoising

  • The wavelet coefficients are denoised by using the wavelet thresholding technique.

  • The inverse discrete wavelet transform is applied to the modified coefficients to get the denoised image.

2.3.2.1 Wavelet thresholding approach for image denoising

Let the original image be {fij} of size M×N, where M and N are integer powers of 2. It has been corrupted by additive zero-mean white homogeneous Gaussian noise {nij} with standard deviation σ, and one observes:

$$\begin{array}{@{}rcl@{}} g_{ij} =f_{ij}+n_{ij} \end{array} $$
(13)

The goal is to estimate the denoised image {fij′} from noisy observation {gij}.

It is convenient to label the sub-bands of the transform as in Fig. 2. The sub-bands HHk, HLk, and LHk are called the details, where k is the level ranging from 1 to J, J being the largest level. The sub-band LLJ is the low-resolution residual. The wavelet transform is applied to (13) to obtain the wavelet coefficients cij. Then, the wavelet thresholding technique filters each wavelet coefficient of the detail sub-bands with a threshold function to obtain the modified coefficients. Finally, the inverse wavelet transform is applied to the modified coefficients to get the denoised image. Two thresholding methods are frequently used. The soft-thresholding function (also called the shrinkage function):

$$\begin{array}{@{}rcl@{}} \eta_{T}(x)=\operatorname{sgn}(x)\cdot\max(|x|-T,0) \end{array} $$
(14)

takes the argument and shrinks it towards zero by the threshold T.

The other popular alternative is the hard-thresholding function:

$$\begin{array}{@{}rcl@{}} \psi_{T}(x)=x\cdot\mathbf{1} \left\lbrace |x|>T\right\rbrace \end{array} $$
(15)

which keeps the input if it is larger than the threshold; otherwise, it is set to zero. The wavelet thresholding procedure removes noise by thresholding only the wavelet coefficients of the detail sub-bands, while keeping the low-resolution coefficients unaltered [7, 30].
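Both thresholding rules of Eqs. (14) and (15) are one-liners; a sketch in NumPy:

```python
import numpy as np

def soft_threshold(x, T):
    """Eq. (14): shrink every coefficient toward zero by T."""
    return np.sign(x) * np.maximum(np.abs(x) - T, 0.0)

def hard_threshold(x, T):
    """Eq. (15): keep a coefficient only if its magnitude exceeds T."""
    return x * (np.abs(x) > T)

c = np.array([-3.0, -0.5, 0.2, 1.4])
soft = soft_threshold(c, 1.0)  # small coefficients vanish, large ones shrink
hard = hard_threshold(c, 1.0)  # small coefficients vanish, large ones survive
```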

Donoho et al. [8] proposed a simple but influential wavelet-based denoising scheme known as VisuShrink. VisuShrink produces stable results with an appealing visual quality. It uses the universal threshold T, which is proportional to the standard deviation of the noise and is defined as follows [9]:

$$\begin{array}{@{}rcl@{}} T =\sigma\sqrt{2logS} \end{array} $$
(16)

where σ2 is the noise variance, estimated as follows:

$$\begin{array}{@{}rcl@{}} \sigma^{2}=[(median |c_{ij}|)/0.6745]^{2} \end{array} $$
(17)

where the cij are the wavelet coefficients of the HH1 sub-band, and S is the number of pixels in the image.
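The universal threshold and the robust noise estimate of Eqs. (16) and (17) can be sketched as follows (using the natural logarithm, the usual convention for the universal threshold; the HH1 band here is simulated pure noise for illustration):

```python
import numpy as np

def sigma_mad(hh1):
    """Eq. (17): robust noise estimate from the HH1 detail sub-band,
    via the median absolute deviation divided by 0.6745."""
    return np.median(np.abs(hh1)) / 0.6745

def universal_threshold(sigma, S):
    """Eq. (16): VisuShrink universal threshold for an image of S pixels."""
    return sigma * np.sqrt(2.0 * np.log(S))

rng = np.random.default_rng(0)
hh1 = rng.normal(0.0, 20.0, size=(128, 128))  # simulated noise-only HH1 band
sigma_hat = sigma_mad(hh1)                    # close to the true sigma = 20
T = universal_threshold(sigma_hat, 256 * 256)
```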

2.3.3 Total variation denoising method

TV-based regularization is among the most influential lines of research in image denoising. TV regularization is based on the statistical fact that natural images are locally smooth and that pixel intensity varies gradually in most regions. It has achieved great success in image denoising because it can not only be computed efficiently but also retain sharp edges.

The l1 total variation functional was first introduced by Rudin, Osher, and Fatemi (ROF) [40] to address image-denoising problems. It is formulated as:

$$\begin{array}{@{}rcl@{}} \min_{\hat{f}}\parallel\bigtriangledown\hat{f}\parallel_{1} \hspace{0.1cm} such \hspace{0.08cm} that \hspace{0.1cm} \parallel \hat{f}-f \parallel_{2}^{2}\leq\sigma \end{array} $$
(18)

where \(\hat {f}\) is the reconstructed denoised image, f is the noisy image, ∥.∥1 is the l1-norm, ∥.∥2 is the l2-norm, and σ is an error tolerance included to account for noisy data. The choice of technique for solving l1 regularization-based problems is crucial, as l1 is nonlinear; therefore, the computational burden can increase significantly using classic gradient-based methods. The recently published Split Bregman (SB) method [27, 41] is a simple and efficient algorithm for solving l1 regularization-based problems that makes it possible to split the minimization of l1 and l2 functionals. By applying the SB method to image denoising proposed in [27], the authors showed that the TV-denoising problem solved using SB formulation was computationally efficient, given that the SB formulation leads to a problem that can be solved using Gauss–Seidel and Fourier transform methods.
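To give a feel for the TV objective, the following is a naive sketch that runs gradient descent on an unconstrained ROF-style functional with a smoothed TV term. It is emphatically NOT the Split Bregman solver discussed above (and its divergence operator is only an approximate adjoint); it merely illustrates TV's edge-preserving smoothing:

```python
import numpy as np

def tv_denoise(f, lam=0.2, step=0.2, iters=200, eps=1e-3):
    """Naive TV denoising sketch: gradient descent on
    0.5*||u - f||^2 + lam*TV_eps(u), with TV smoothed by eps."""
    u = f.astype(float).copy()
    for _ in range(iters):
        gx = np.diff(u, axis=1, append=u[:, -1:])   # forward differences
        gy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(gx ** 2 + gy ** 2 + eps)
        px, py = gx / mag, gy / mag                  # normalized gradient field
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u -= step * ((u - f) - lam * div)            # descend the objective
    return u

rng = np.random.default_rng(0)
clean = np.zeros((32, 32)); clean[:, 16:] = 1.0      # one sharp edge
noisy = clean + rng.normal(0.0, 0.3, clean.shape)
denoised = tv_denoise(noisy)
# The result has much lower total variation than the noisy input while the
# step edge survives, which is exactly TV's selling point.
```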

2.3.4 BM3D denoising method

The BM3D method [28] is a famous denoising method and has become a baseline algorithm for testing the performance of denoising algorithms. BM3D includes the following steps. The first step is block matching: for each image block located at j, similar image blocks of size \(\sqrt {n}\times \sqrt {n}\) are collected into a group with Ij members, and the blocks in each group are stacked together to form a \(\sqrt {n}\times \sqrt {n}\times I_{j}\) 3D data array. In the second, sparsity regularization step, the 3D arrays are decorrelated using an invertible sparse 3D transform such as the discrete cosine transform (DCT) [42] and then filtered by thresholding or Wiener filtering. Finally, the restored image is obtained by aggregating all the estimated image patches.

2.4 Proposed denoising method

Suppose the input noisy image y comes from a clean image x contaminated by additive zero-mean white homogeneous Gaussian noise z with standard deviation σ. The measured image y is thus

$$\begin{array}{@{}rcl@{}} y=x+z \end{array} $$
(19)

We desire to design an algorithm that can remove the noise from y, getting a denoised image \(\hat {x}\) as close as possible to the original image x. As shown in Fig. 4, the proposed denoising method consists of three steps:

  • In the first step, discrete wavelet transform is applied to the noisy image y, to get the approximation and wavelet coefficients.

    Fig. 4
    figure 4

    Block diagram of the proposed denoised method

  • In the second step, approximation and wavelet coefficients are denoised by using an adaptive dictionary learned on the set of extracted patches from wavelet representation of the noisy image, by the K-SVD as a dictionary learning algorithm.

  • In the third step, inverse discrete wavelet transform is applied for reconstructing the image, which results in the denoised image \(\hat {x}\).
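The three steps above can be sketched end to end. In this runnable sketch the Haar transform stands in for the chosen wavelet, and the per-band denoiser is pluggable: here we soft-threshold only the detail bands for simplicity, whereas the proposed method denoises the approximation band as well, using a learned K-SVD dictionary instead of thresholding.

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)  # Haar filter coefficient (illustrative wavelet)

def dwt2(a):
    """One-level separable Haar DWT (rows, then columns)."""
    lo, hi = H * (a[:, 0::2] + a[:, 1::2]), H * (a[:, 0::2] - a[:, 1::2])
    col = lambda b: (H * (b[0::2] + b[1::2]), H * (b[0::2] - b[1::2]))
    (LL, LH), (HL, HH) = col(lo), col(hi)
    return LL, LH, HL, HH

def idwt2(LL, LH, HL, HH):
    """Inverse of dwt2 (columns, then rows)."""
    def up(c, d):
        out = np.empty((2 * c.shape[0], c.shape[1]))
        out[0::2], out[1::2] = H * (c + d), H * (c - d)
        return out
    lo, hi = up(LL, LH), up(HL, HH)
    out = np.empty((lo.shape[0], 2 * lo.shape[1]))
    out[:, 0::2], out[:, 1::2] = H * (lo + hi), H * (lo - hi)
    return out

def denoise(y, denoise_band):
    LL, LH, HL, HH = dwt2(y)                               # step 1: forward DWT
    bands = [LL] + [denoise_band(b) for b in (LH, HL, HH)]  # step 2: denoise
    return idwt2(*bands)                                    # step 3: inverse DWT

rng = np.random.default_rng(0)
x = np.zeros((16, 16)); x[:, 8:] = 10.0                     # toy clean image
y = x + rng.normal(0.0, 1.0, x.shape)
soft = lambda b: np.sign(b) * np.maximum(np.abs(b) - 1.0, 0.0)
x_hat = denoise(y, soft)   # lower MSE against x than the noisy input y
```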

The detailed steps of the proposed method are given below.

In (19), we assume a clean image x of size \(\sqrt {N}\times \sqrt {N}\) pixels and apply the 2D discrete wavelet transform to the noisy image y. According to (19) and the linearity of the wavelet transform, we have:

$$\begin{array}{@{}rcl@{}} Wy=Wx+Wz \end{array} $$
(20)

where Wy, Wx, and Wz are the wavelet transforms of y, x, and z, respectively. As in the K-SVD algorithm, the dictionary is learned on small image patches: denote each small patch Wxij=RijWx of size \(\sqrt {n}\times \sqrt {n}\) at location (i,j) of Wx. The matrix Rij is an n×N matrix that extracts the (i,j) block, ordered as an n×1 vector, from the image Wx. For the image Wx of size \(\sqrt {N}\times \sqrt {N}\), the summation over i,j includes \((\sqrt {N}-\sqrt {n}+1)^{2}\) items, considering all image patches of size \(\sqrt {n}\times \sqrt {n}\) in Wx with overlaps. When a redundant dictionary D∈Rn×k is given, then according to the sparsity prior on wavelet coefficients, every patch Wxij has a sparse representation with bounded error, and we get:

$$\begin{array}{@{}rcl@{}} \hat{\alpha}_{ij}=\arg\min_{\alpha_{ij}}\parallel\alpha_{ij}\parallel_{0} \hspace{0.1cm} s.t.\hspace{0.1cm} \parallel {Wx}_{ij}-D\alpha_{ij} \parallel_{2}^{2}\leq (C\sigma)^{2} \end{array} $$
(21)

The Lagrange form of it is:

$$\begin{array}{@{}rcl@{}} \hat{\alpha}_{ij}=\arg\min_{\alpha_{ij}}\parallel {Wx}_{ij}-D\alpha_{ij} \parallel_{2}^{2} + \mu_{ij}\parallel\alpha_{ij}\parallel_{0} \end{array} $$
(22)

By maximum a posteriori estimation, denoising of Wy is equivalent to the energy minimization problem:

$$\begin{array}{@{}rcl@{}} \left\lbrace \hat{\alpha}_{ij},W\hat{x}\right\rbrace =\arg\min_{\alpha_{ij},Wx}\lambda\parallel Wx-Wy \parallel_{2}^{2} + \sum_{ij}\mu_{ij}\parallel\alpha_{ij}\parallel_{0} + \sum_{ij}\parallel R_{ij}Wx-D\alpha_{ij} \parallel_{2}^{2} \end{array} $$
(23)

In this expression, the first term is the log-likelihood between Wx and Wy, and λ depends on the constraint \(\parallel Wx-Wy \parallel _{2}^{2}\leq Cst\times \sigma ^{2} \). The second and third terms encode the sparsity prior on the wavelet coefficients.

When D is known, we can solve (23) in two steps. First, let Wx=Wy; then (23) reduces to solving \((\sqrt {N}-\sqrt {n}+1)^{2}\) instances of (22), each of which can be solved by OMP. This step is called sparse coding. Second, having obtained \(\hat {\alpha }_{ij}\) for all locations (i,j), we fix them and update Wx. Returning to (23), we need to solve:

$$\begin{array}{@{}rcl@{}} W\hat{x} =\arg\min_{Wx}\lambda\parallel Wx-Wy \parallel_{2}^{2} + \sum_{ij}\parallel R_{ij}Wx-D\hat{\alpha}_{ij} \parallel_{2}^{2} \end{array} $$
(24)

This is a simple quadratic term that has a closed-form solution of the form:

$$\begin{array}{@{}rcl@{}} W\hat{x} =\left(\lambda I + \sum_{ij}R_{ij}^{T}R_{ij}\right)^{-1}\left(\lambda Wy + \sum_{ij}R_{ij}^{T}D\hat{\alpha}_{ij}\right) \end{array} $$
(25)
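The closed form in (25) can be sketched directly (helper names are hypothetical; `patches[t]` plays the role of the denoised patch \(D\hat{\alpha}_{ij}\) at position `positions[t]`):

```python
import numpy as np

def patch_average(Wy, patches, positions, p, lam):
    """Closed-form solution of Eq. (25): every coefficient is a weighted
    average of its noisy value (weight lam) and all overlapping denoised
    patches that cover it (weight 1 each)."""
    num = lam * np.asarray(Wy, dtype=float).copy()   # lam * Wy term
    den = lam * np.ones_like(num)                    # lam * I term
    for (i, j), patch in zip(positions, patches):
        num[i:i + p, j:j + p] += patch               # R_ij^T (D alpha_ij)
        den[i:i + p, j:j + p] += 1.0                 # diagonal of R_ij^T R_ij
    return num / den

Wy = np.full((4, 4), 2.0)
positions = [(i, j) for i in range(3) for j in range(3)]
patches = [np.full((2, 2), 2.0)] * len(positions)    # overlapping 2x2 patches
Wx_hat = patch_average(Wy, patches, positions, p=2, lam=0.5)
# When every denoised patch agrees with the noisy value, the weighted
# average returns that value unchanged.
```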

This expression describes an averaging of the denoised patches and is called the patch-averaging step. Given the updated Wx, we can repeat the sparse coding stage, now working on the already denoised patches; once this is done, a new average is computed, and so on. Finally, satisfactory \(\hat {\alpha }_{ij}\) and \(W\hat {x}\) are obtained. In practice, the dictionary D is unknown, and we obtain it by learning, in the same way as [23]. Incorporating dictionary learning into (23), we get the denoising model as follows:

$$\begin{array}{@{}rcl@{}} \left\lbrace \hat{\alpha}_{ij},\hat{D},W\hat{x}\right\rbrace &=&\arg\min_{\alpha_{ij},D,Wx}\lambda\parallel Wx-Wy \parallel_{2}^{2} +\sum_{ij}\mu_{ij}\parallel\alpha_{ij}\parallel_{0}\\ &&\quad+ \sum_{ij}\parallel R_{ij}Wx-D\alpha_{ij} \parallel_{2}^{2} \end{array} $$
(26)

In this work, the K-SVD algorithm is used to learn and update the initial redundant DCT dictionary, with the patches Wyij as the training set. We can now solve (26) as follows: (a) given the initial dictionary D, let Wx=Wy and compute \(\hat {\alpha }_{ij}\) by (21); (b) update the initial dictionary D to \(\hat {D}\) using K-SVD; and (c) compute \(W\hat {x}\) by (25). Finally, let \(\hat {D}\) and \(W\hat {x}\) be the new initial D and Wx, and repeat the above process until the denoised coefficients \(W\hat {x}\) are obtained. This completes the denoising process in the wavelet domain; the 2D inverse discrete wavelet transform is then applied to obtain the restored image \(\hat {x}\). The following sequence defines our proposed algorithm:

3 Results and discussion

In this section, we aim to demonstrate the advantages and the performance of our proposed wavelet denoising approach based on an unsupervised learning model.

Our proposed algorithm is evaluated and compared with four representative and state-of-the-art denoising algorithms: the wavelet thresholding approach for image denoising, the sparse representation-based K-SVD denoising method in the image domain, the TV denoising method, and the BM3D denoising method.

In our experiments, we choose five well-known test images: “Barbara,” “House,” “Flinstones,” “Bridge,” and “Fingerprint.” Each image is contaminated by adding zero-mean white Gaussian noise with various standard deviations. The following parameters are used in our proposed method. For the choice of the best wavelet basis, we test several wavelets at each noise level for all images and keep the results with the highest PSNR for comparison; as a result, for the wavelet transform and its inverse, we choose the Coiflet wavelet (MATLAB ’coif’) for the images “House” and “Fingerprint,” and the Symlet wavelet (MATLAB ’sym’) for the remaining images. The patch size is n=64, the number of dictionary elements is k=256, the Lagrange multiplier is λ=30/σ, and the noise gain is C=1.15.

Since we evaluate the denoised images with the PSNR and the SSIM index, their formulas are given below. The PSNR is computed from the mean square error (MSE) as:

$$\begin{array}{@{}rcl@{}} MSE=\frac{1}{N_{1}N_{2}}\sum_{i=1}^{N_{1}}\sum_{j=1}^{N_{2}}\left[ I_{1}(i,j)- I_{2}(i,j) \right]^{2} \end{array} $$
(27)
$$\begin{array}{@{}rcl@{}} PSNR=10\times \log_{10} \left(\frac{255^{2}}{MSE}\right) \end{array} $$
(28)

where I1 and I2 denote two images of size N1×N2, and I1(i,j) represents the pixel value in the ith row and jth column of I1. The SSIM index is defined as:

$$\begin{array}{@{}rcl@{}} SSIM(I_{1},I_{2})=\frac{(2\mu_{I_{1}} \mu_{I_{2}}+C_{1})(2\sigma_{I_{1}I_{2}}+C_{2})}{(\mu_{I_{1}}^{2}+\mu_{I_{2}}^{2}+C_{1})(\sigma_{I_{1}}^{2}+\sigma_{I_{2}}^{2}+C_{2})} \end{array} $$
(29)

where \(\mu _{I_{1}}\) and \(\sigma _{I_{1}}^{2}\) denote the average gray value and the variance of I1, \(\sigma _{I_{1}I_{2}}\) represents the covariance between the two images I1 and I2, and C1 and C2 are two small constants [43]. The higher the PSNR or SSIM value, the closer the denoised image I2 is to the original image I1, and hence the more accurate the denoising.
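Both quality measures can be sketched directly from Eqs. (27)-(29). Note that the SSIM below is computed over the whole image in a single window, whereas the standard SSIM index averages this quantity over local windows; the constants follow the usual choice C1=(0.01·255)² and C2=(0.03·255)².

```python
import numpy as np

def psnr(I1, I2):
    """PSNR from Eqs. (27)-(28), for 8-bit images (peak value 255)."""
    mse = np.mean((I1.astype(float) - I2.astype(float)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

def ssim_global(I1, I2, C1=6.5025, C2=58.5225):
    """Single-window SSIM of Eq. (29) over the whole image (sketch)."""
    I1, I2 = I1.astype(float), I2.astype(float)
    mu1, mu2 = I1.mean(), I2.mean()
    v1, v2 = I1.var(), I2.var()
    cov = ((I1 - mu1) * (I2 - mu2)).mean()
    return ((2 * mu1 * mu2 + C1) * (2 * cov + C2)) / \
           ((mu1 ** 2 + mu2 ** 2 + C1) * (v1 + v2 + C2))

I1 = np.tile(np.arange(16.0), (16, 1)) * 17.0  # simple gradient image
I2 = I1 + 5.0                                  # shifted by a constant
# psnr(I1, I2) is 10*log10(255^2/25), about 34.15 dB; ssim_global of an
# image with itself is exactly 1.
```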

The comparison of quantitative results is shown in Tables 1 and 2. A visual comparison is shown in Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16.

Fig. 5
figure 5

Denoising performance comparisons of “Barbara” with the noise deviation σ=10 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 6
figure 6

Denoising performance comparisons of “House” with the noise deviation σ=10 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 7
figure 7

Denoising performance comparisons of “Barbara” with the noise deviation σ=50 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 8
figure 8

Denoising performance comparisons of “Barbara” with the noise deviation σ=70 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 9
figure 9

Denoising performance comparisons of “House” with the noise deviation σ=50 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 10
figure 10

Denoising performance comparisons of “House” with the noise deviation σ=70 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 11
figure 11

Denoising performance comparisons of “Flinstones” with the noise deviation σ=10 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 12
figure 12

Denoising performance comparisons of “Flinstones” with the noise deviation σ=70 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 13
figure 13

Denoising performance comparisons of “Bridge” with the noise deviation σ=10 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 14
figure 14

Denoising performance comparisons of “Bridge” with the noise deviation σ=70 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 15
figure 15

Denoising performance comparisons of “Fingerprint” with the noise deviation σ=10 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Fig. 16
figure 16

Denoising performance comparisons of “Fingerprint” with the noise deviation σ=70 by different methods. a Noisy image. b Denoised image by wavelet thresholding. c Denoised image by TV. d Denoised image by K-SVD. e Denoised image by BM3D. f Denoised image by the proposed method

Table 1 The PSNR results of the denoised images by different denoising schemes
Table 2 The SSIM results of the denoised images by different denoising schemes

Let us first examine the PSNR and SSIM results of the different methods on the five test images. The PSNR results of the five methods are given in Table 1. They show that the proposed method outperforms the wavelet thresholding, TV, and K-SVD denoising methods in all cases. Compared with BM3D, the proposed method has a lower PSNR at some noise levels for the two images “Barbara” and “House,” but for the remaining images it achieves higher PSNR than BM3D. The SSIM results of the five methods are recorded in Table 2. Based on these results, we can deduce that our proposed method is consistently better than the other denoising methods, which confirms the efficiency of our algorithm.

Let us now focus on the visual quality of these denoising algorithms. We take two example images, “Barbara” and “House,” to show that although BM3D attains higher PSNR than the proposed method at some noise levels on these two images, the proposed method yields better visual quality than BM3D. To show the performance under different levels of noise, for “Barbara” and “House” we present results under low-level noise with standard deviation σ=10 and high-level noise with σ=50 and σ=70; for the remaining images, we show the denoising results at σ=10 and σ=70. Figures 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16 present the denoising results produced by the proposed method and by the wavelet thresholding, TV, K-SVD, and BM3D methods. Under low-level noise with σ=10, the denoised images produced by the proposed method and by BM3D are very similar in visual perception, and both have much better visual quality than all the other methods. Under high noise levels, the results show that in our method the edges are well preserved, the textures and fine details are better restored, and the fewest artifacts appear. The wavelet thresholding, K-SVD, and TV denoising methods perform worst at all noise levels: the restored images contain too many artifacts, the image edges and flat areas are blurred, and many details and textures are lost. It was also found in the experiments that the proposed method has much better visual quality than BM3D, which produces noticeable ringing artifacts caused by stacking image blocks and erroneous grouping.

Depending on the size of the image, the execution of the main proposed algorithm requires an average of 17 s to 2 min on an Intel(R) Core(TM) i3-2330M CPU @ 2.20 GHz. The average times required by the other algorithms are as follows: the wavelet thresholding approach, 2 to 5 s; the K-SVD denoising method in the image domain, 15 s to 2 min; the TV denoising method, 12 to 30 s; and the BM3D denoising method, 5 to 15 s. The proposed algorithm is slower than the other denoising methods because the K-SVD dictionary learning algorithm requires many iterations.

Our proposed method performs better than the state-of-the-art denoising methods thanks to the merits of the wavelet transform and to the use of an adaptive dictionary devoted to noise reduction in place of the thresholding operator.

4 Conclusion

We propose in this work a new method for image denoising. The approach exploits the merits of the wavelet transform (sparsity, multi-resolution structure, similarity with the human visual system, and good localization in both space and frequency) to adapt an unsupervised dictionary learning algorithm, creating a dictionary devoted to eliminating useless information while keeping the most significant content. Experiments illustrate that the proposed method achieves better performance than several well-developed denoising methods in terms of PSNR, SSIM, and visual quality. In future work, we will focus on reducing the computational cost of the proposed model.