Abstract

A target recognition method for synthetic aperture radar (SAR) images based on complex bidimensional empirical mode decomposition (C-BEMD) is proposed. C-BEMD decomposes the original SAR image into multilevel complex bidimensional intrinsic mode functions (BIMFs), which reflect the two-dimensional time-frequency characteristics of the target. In the classification stage, the decomposed multilevel BIMFs are jointly represented using the multitask sparse representation. Finally, the target category of the test sample is determined according to the reconstruction errors associated with the different training classes. In the experiments, the standard operating condition (SOC) and extended operating conditions (EOCs) are set up on the MSTAR dataset to test the proposed method. The results confirm the effectiveness and robustness of the method.

1. Introduction

Synthetic aperture radar (SAR) image processing has potential value in both military and civil fields [1]. SAR target recognition determines the target category by analyzing the target characteristics in images, and can be performed in template-based [2, 3] and model-based ways [4–7]. Feature extraction is one of the key steps in SAR target recognition; it extracts and represents the target characteristics. Commonly used SAR image features include geometric, transformation, and electromagnetic ones. In [8–17], regions (including the target and shadow regions) and contours were used for SAR target recognition to describe the shape distributions. The transformation features can be summarized into two categories. One type is obtained by mathematical projection algorithms [18–22], typically principal component analysis (PCA) [18] and nonnegative matrix factorization (NMF) [20]. The other type uses image decomposition through signal processing algorithms, such as the monogenic signal [23] and bidimensional empirical mode decomposition (BEMD) [24]. In [25], the visual saliency model was employed for discriminative feature learning. The electromagnetic features describe the backscattering characteristics of the target reflected in the SAR imaging process, such as polarization [26, 27] and scattering centers [28–31]. The classifier is another key step of SAR target recognition; it assigns a class to the extracted features. A large number of classifiers have been used and verified in SAR target recognition, including support vector machines (SVM) [32, 33] and sparse representation-based classification (SRC) [34–36].
In recent years, with the maturity of deep learning theory and algorithms [37–39], a large number of SAR target recognition methods have been developed based on deep learning models, among which the most representative one is the convolutional neural network (CNN) [40–46].

The results of feature extraction, as the input of the classifier, largely determine the classification accuracy. Therefore, designing new SAR image feature extraction algorithms is of great significance for target recognition. This paper proposes a SAR target recognition method based on complex BEMD (C-BEMD) [47, 48]. C-BEMD is an extension of the traditional EMD [49, 50] and BEMD [24, 49] to the complex domain, which can be directly used for the processing and analysis of complex images. In [24], the authors applied BEMD to SAR image decomposition and target recognition and verified its effectiveness. However, SAR images are complex-valued, with both amplitude and phase information. Using the image intensities alone discards the discriminative information carried by the phase distribution. In this sense, C-BEMD can more effectively reflect the two-dimensional time-frequency characteristics of the target, thereby providing richer information for the subsequent classification. For the bidimensional intrinsic mode functions (BIMFs) obtained by C-BEMD, this paper adopts the multitask sparse representation for decision-making in the classification stage. The multitask sparse representation is a general extension of the traditional single-task one, which considers and makes use of the relationship between several tasks. For the BIMFs decomposed by C-BEMD, their inner correlations can be exploited by the multitask sparse representation, thereby improving the overall reconstruction accuracy. Finally, the target category of the test sample is determined according to the total reconstruction errors of all the BIMFs of the test sample achieved by the individual training classes. In the experiments, a variety of operating conditions were set up based on the MSTAR dataset to test the proposed method. The experimental results verify the effectiveness and robustness of this method.

2. Basics of C-BEMD

EMD was first developed to adaptively analyze nonstationary signals [49, 50]. Unlike traditional signal decomposition methods, e.g., wavelet analysis, EMD does not impose any prior assumptions on the data, such as linearity or stationarity. In previous studies, EMD has been numerically validated to be more capable of describing patterns in nonstationary and nonlinear signals. As a natural generalization of EMD to 2D space, BEMD is capable of describing an image using several BIMFs [24, 49]. The original image is decomposed into high- and low-frequency components with some residues. Hence, the generated BIMFs can better reflect the global and detailed information of the decomposed image. However, the traditional BEMD is designed for real signals and cannot process complex signals or images. Yeh extended BEMD to the complex domain so that a complex matrix can be decomposed directly. According to [47, 48], the specific implementation process of the C-BEMD algorithm can be summarized in the following steps.

Step 1. Construction of a two-dimensional band-pass filter as
where represents a matrix with the sizes of ; is a zero matrix with the sizes of . The values of and are determined as follows:

Step 2. Construction of 4 analytic signals as
where denotes the two-dimensional Fourier transform of the input image .

Step 3. Get the two-dimensional inverse Fourier transforms of and and extract their real parts as and . Get the two-dimensional inverse Fourier transforms of and and extract their imaginary parts as and .

Step 4. Apply BEMD to decompose , , , and , respectively. The obtained BIMFs are denoted as , , , and , with the numbers of decompositions as , .

Step 5. Apply to process the and obtain the complex BIMF as
In equation (4), the function is defined as
The detailed deduction and implementation of C-BEMD can be found in [45].
In this paper, C-BEMD is applied to the decomposition of complex SAR images, and the two-dimensional time-frequency characteristics of the targets are described through multilevel BIMFs. Figure 1 shows the decomposition of a SAR image (Figure 1(a)) from the MSTAR dataset, with the amplitude parts of the first three BIMFs shown in Figures 1(b)–1(d), respectively. It can be seen that the decomposition results effectively describe the characteristics associated with the target, while complementing the original image with more detailed information. Therefore, this paper jointly uses the original image and the BIMFs decomposed by C-BEMD for the following classification.

3. Classification of Multilevel BIMFs for Target Recognition

3.1. Multitask Sparse Representation

The multitask sparse representation can be considered as a united and compact form of several related sparse representation tasks [51, 52]. With the constraint of inner correlations, the multitask sparse representation can produce more precise and robust solutions than those from individual tasks. As reported in [53–57], the multitask sparse representation has been successfully applied to SAR target recognition to classify multiple views, features, resolutions, etc. This study employs it for the classification of multilevel BIMFs generated by C-BEMD. Assume there are M BIMFs denoted as $[y^{(1)}, y^{(2)}, \ldots, y^{(M)}]$, which are from the same test sample $y$, and they are represented based on the sparse representations as
$$\min_{A} g(A) = \sum_{k=1}^{M} \left\| y^{(k)} - D^{(k)} \alpha^{(k)} \right\|_2^2, \tag{6}$$
where $D^{(k)}$ forms the dictionary of the kth BIMF; $A = [\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(M)}]$ stores the coefficient vectors of all the BIMFs.

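Equation (6) aims to minimize the total reconstruction error but neglects the correlations between different tasks. As the decompositions are from the same image, different levels of BIMFs are actually correlated. So, the core of the multitask sparse representation is the constraint on the coefficient matrix, and the optimization problem is changed to
$$\min_{A} \sum_{k=1}^{M} \left\| y^{(k)} - D^{(k)} \alpha^{(k)} \right\|_2^2 + \lambda \left\| A \right\|_{2,1}, \tag{7}$$
where $y^{(k)}$, $D^{(k)}$, and $\alpha^{(k)}$ denote the kth BIMF, its dictionary, and its coefficient vector; $\|\cdot\|_{2,1}$ calculates the $\ell_1/\ell_2$ norm of the coefficient matrix $A = [\alpha^{(1)}, \ldots, \alpha^{(M)}]$, i.e., the sum of the $\ell_2$ norms of its rows; $\lambda$ is a nonnegative constant acting as the regularization parameter.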
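As an illustration of how a jointly constrained problem of this kind can be solved, the following is a minimal sketch, not the solver used in this paper. It assumes a squared reconstruction error, a row-sparsity ($\ell_1/\ell_2$) penalty on the coefficient matrix, and a plain proximal gradient iteration; the function names `row_soft_threshold` and `multitask_sparse_code`, the step-size rule, and the default parameters are all hypothetical choices made for this example.

```python
import numpy as np


def row_soft_threshold(A, tau):
    """Proximal operator of tau * (sum of row l2 norms): shrink each row of A."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * A


def multitask_sparse_code(Y, D, lam=0.1, n_iter=200):
    """Jointly code M tasks (assumed formulation, not the paper's solver).

    Y : list of M vectors, Y[k] has shape (d_k,)
    D : list of M dictionaries, D[k] has shape (d_k, n); column j of every
        dictionary is assumed to come from the same training sample
    Returns the coefficient matrix A with shape (n, M).
    """
    n_tasks = len(Y)
    n_atoms = D[0].shape[1]
    A = np.zeros((n_atoms, n_tasks))
    # Step size from the largest Lipschitz constant of the per-task gradients.
    lipschitz = max(2.0 * np.linalg.norm(Dk, 2) ** 2 for Dk in D)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Gradient of sum_k ||Y[k] - D[k] A[:, k]||^2, one column per task.
        grad = np.column_stack(
            [2.0 * D[k].T @ (D[k] @ A[:, k] - Y[k]) for k in range(n_tasks)]
        )
        # Gradient step followed by row-wise shrinkage (shared sparsity pattern).
        A = row_soft_threshold(A - step * grad, step * lam)
    return A
```

Because the shrinkage acts on whole rows, an atom is either kept or suppressed for all components at once, which is one way to encode the shared pattern across tasks.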

As validated, the coefficient vectors of different components solved by equation (7) tend to share similar patterns originating from their inner correlations. According to related studies [53–57], such modifications effectively improve the reconstruction precision, especially for pattern recognition problems. With the estimated coefficient matrix $\hat{A}$, the reconstruction errors of the test sample with respect to individual training classes can be obtained for the determination of the target category as
$$\mathrm{identity}(y) = \arg\min_{i} \sum_{k=1}^{M} \left\| y^{(k)} - D_i^{(k)} \hat{\alpha}_i^{(k)} \right\|_2^2, \tag{8}$$
where $y^{(k)}$ is the kth BIMF of the test sample; $D_i^{(k)}$ extracts the subdictionary of the kth BIMF in the ith class; $\hat{\alpha}_i^{(k)}$ refers to the corresponding coefficients.
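The class decision described above can be sketched as follows. This is a minimal illustration under assumptions: the squared error is used, every dictionary shares the same column ordering, and the name `classify_by_reconstruction` and its argument layout are hypothetical rather than taken from the paper.

```python
import numpy as np


def classify_by_reconstruction(Y, D, A, labels):
    """Pick the class whose atoms give the smallest total reconstruction error.

    Y      : list of M component vectors of the test sample
    D      : list of M dictionaries, D[k] has shape (d_k, n)
    A      : estimated coefficient matrix, shape (n, M)
    labels : class label of each of the n atoms (same order in every D[k])
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    errors = []
    for c in classes:
        mask = labels == c
        # Sum the per-component errors using only the atoms of class c.
        err = sum(
            float(np.sum((Y[k] - D[k][:, mask] @ A[mask, k]) ** 2))
            for k in range(len(Y))
        )
        errors.append(err)
    best = classes[int(np.argmin(errors))]
    return best, dict(zip(classes.tolist(), errors))
```

Returning the per-class errors alongside the decision makes it easy to inspect how close the runner-up class was.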

3.2. Target Recognition

Figure 2 shows the basic flow of the proposed method for SAR target recognition. The training samples are first decomposed by C-BEMD to obtain the multilevel BIMFs, and a global dictionary is constructed for each level accordingly. For the test sample, the same C-BEMD is applied to obtain the corresponding levels of BIMFs. Then, the BIMFs of the test sample are jointly represented over the constructed dictionaries. Finally, the target category of the test sample is determined according to the reconstruction errors from equation (8).
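The dictionary construction step can be illustrated with a short sketch. It assumes that each training sample has already been reduced to a list of real-valued component vectors (one per task) and that atoms are scaled to unit norm, which is a common convention in sparse representation but is not stated in this paper; `build_dictionaries` is a hypothetical helper name.

```python
import numpy as np


def build_dictionaries(train_components, train_labels):
    """Build one global dictionary per component (task).

    train_components : list over training samples; train_components[n][k] is
                       the k-th component vector of the n-th sample
    train_labels     : class label of each training sample
    Returns (list of dictionaries, label array). Column n of every dictionary
    comes from training sample n, so the labels apply to all of them.
    """
    n_tasks = len(train_components[0])
    dictionaries = []
    for k in range(n_tasks):
        Dk = np.column_stack([sample[k] for sample in train_components])
        Dk = Dk.astype(float)
        # Assumed convention: normalize every atom to unit l2 norm.
        Dk = Dk / np.maximum(np.linalg.norm(Dk, axis=0, keepdims=True), 1e-12)
        dictionaries.append(Dk)
    return dictionaries, np.asarray(train_labels)
```

Keeping the same column order in every dictionary is what allows a single label vector to index the class subdictionaries of all components.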

In practice, the BIMFs obtained by C-BEMD are complex, with both amplitude and phase parts, so the two parts are extracted and used separately, together with the original SAR image. As shown in Figure 2, the K BIMFs for the dictionaries come from K/2 decompositions by C-BEMD, where the former K/2 represent the amplitude and the latter K/2 refer to the phase. Specifically, in this paper, the first three BIMFs are used together with the original SAR image, as shown in Figure 1. Both the global and local information of SAR targets can be characterized by these components. Therefore, the proposed method can make full use of the two-dimensional time-frequency characteristics of complex SAR images to improve the final recognition performance.
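The separation into amplitude and phase components can be sketched as below. The sketch assumes the complex BIMFs are already available as NumPy arrays and that the original image contributes only its amplitude; the latter is an assumption, since the text does not say how the original image is represented. The helper name `stack_components` is hypothetical.

```python
import numpy as np


def stack_components(image, bimfs):
    """Turn a complex image and its complex BIMFs into real feature vectors.

    Order of the returned list: amplitude of the original image (assumed),
    then the amplitudes of all BIMFs, then the phases of all BIMFs.
    """
    components = [np.abs(image)]
    components += [np.abs(b) for b in bimfs]    # amplitude parts
    components += [np.angle(b) for b in bimfs]  # phase parts, in radians
    return [c.ravel() for c in components]
```

With three BIMFs this yields seven real-valued vectors per sample, each of which would be treated as one task in the joint representation.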

4. Experiments

4.1. Preparation

The proposed method is tested on the MSTAR dataset. The dataset contains SAR images of the 10 types of targets shown in Figure 3, covering 0°∼360° azimuth angles and typical depression angles such as 15°, 17°, 30°, and 45°. Owing to its abundant data samples, the MSTAR dataset has long been the benchmark data source for the verification of SAR target recognition methods. Following existing studies, this study uses the MSTAR dataset to set up typical operating conditions for the experiments, including the standard operating condition (SOC) and three extended operating conditions (EOCs): configuration differences, depression angle differences, and noise interference.

Table 1 shows the training and test samples for the 10-class classification task under SOC, which come from 17° and 15° depression angles, respectively. All types of targets are from the same configurations, with only a small difference in the depression angles. Therefore, the test and training samples are highly similar, and the recognition problem is relatively simple. Table 2 lists the training and test samples under the condition of configuration differences, including 3 types of targets. Among them, the training and test samples of BMP2 and T72 come from completely different configurations. Table 3 shows the training and test samples from different depression angles. In this case, the training samples are from 17° depression angle, but the test ones are from 30° and 45°, respectively. In addition, on the basis of the experimental setting in Table 1, noise is added to the test samples to generate test sets with different signal-to-noise ratios (SNRs) [29]. The proposed method can then be evaluated under noise interference. Figure 4 shows some noisy SAR images at different SNRs, where the influence of noise on the target appearance can be observed.
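One way to generate a noisy test sample at a prescribed SNR is sketched below. The noise model is an assumption: additive white Gaussian noise whose power is set relative to the mean power of the image, which may differ from the procedure in [29]. The function name `add_noise_at_snr` and its parameters are hypothetical.

```python
import numpy as np


def add_noise_at_snr(image, snr_db, rng=None):
    """Add white Gaussian noise so that the result has the requested SNR (dB).

    Assumed definition: SNR = mean signal power / mean noise power.
    Works for real or complex images; complex noise is circular.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(np.abs(image) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    if np.iscomplexobj(image):
        # Split the noise power equally between real and imaginary parts.
        noise = rng.normal(size=image.shape) + 1j * rng.normal(size=image.shape)
        noise = noise * np.sqrt(noise_power / 2.0)
    else:
        noise = rng.normal(size=image.shape) * np.sqrt(noise_power)
    return image + noise
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the noisy test sets reproducible across the compared methods.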

Six reference methods are selected from the literature and compared with the proposed one under the same conditions. The first one comes from [24], which employed BEMD for SAR image feature extraction. The second one used the visual saliency model for feature extraction [25] and is denoted as the VSM method. The third and fourth methods are CNN-based ones, using the residual network (Res-Net) [42] and deep features [46], respectively. The last two are developed based on the multitask sparse representation to classify multiple features (extracted by PCA, kernel PCA, and NMF) [55] and multiresolution representations [56]. They are abbreviated as “multifeature” and “multiresolution,” respectively. The following experiments are conducted sequentially under SOC and the three EOCs, and all the methods are compared quantitatively.

4.2. SOC

Relying on the experimental setup in Table 1, the proposed method is tested and verified under SOC. Figure 5 shows the classification confusion matrix of the 10 types of targets. The horizontal and vertical coordinates in the figure represent the true and the predicted labels of the test samples, respectively. Hence, the diagonal elements are the classification accuracies of different targets. This study defines the recognition rate as the ratio of the number of correctly classified test samples to the total number of test samples. The average recognition rates on the 10 types of targets achieved by the various methods are shown in Table 4. It can be seen that all the methods achieve high recognition performance under SOC. Among them, the proposed method outperforms the six reference methods with an average recognition rate of 99.34%. Under SOC, the training and test samples are highly similar, so the training set effectively covers the situations in the test set, which contributes to the good performance of the CNN-based methods, i.e., Res-Net and deep feature. In particular, compared with the BEMD method, this paper uses C-BEMD to explore the time-frequency characteristics of the original complex SAR image and obtain more effective feature descriptions. Therefore, the final performance of the proposed method is better than that of the BEMD method. The multifeature and multiresolution methods use the multitask sparse representation in the classification stage, as does the proposed one. The higher recognition rate of the proposed method shows that the BIMFs decomposed by C-BEMD have higher discriminability than the multiple features or resolutions.
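The confusion matrix and the recognition rate described above can be computed as in the following minimal sketch. It assumes integer class labels starting at 0 and places the true class on the rows; the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(true_labels, pred_labels, n_classes):
    """Count matrix with true classes on the rows and predictions on the columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm


def recognition_rates(cm):
    """Per-class rates (diagonal / row sums) and the overall recognition rate."""
    per_class = np.diag(cm) / cm.sum(axis=1)
    overall = np.trace(cm) / cm.sum()  # correctly classified / total samples
    return per_class, overall
```

The overall value is the ratio of correctly classified test samples to all test samples, while the per-class values correspond to the diagonal of a row-normalized confusion matrix.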

4.3. Configuration Differences

Relying on the experimental setup in Table 2, the proposed method is tested and verified under configuration differences. Table 5 lists the recognition rate for each configuration of BMP2 and T72, and the average recognition rate over all the configurations reaches 98.52%. The recognition performance of the various methods is shown in Table 6. Compared with the SOC case, the recognition rates of the different methods decrease to some extent because of the configuration differences. In particular, the Res-Net and deep feature methods drop significantly because the training samples are inadequate to cover the situations in the test set. Compared with the traditional BEMD method, the recognition rate of the proposed method is improved, which indicates that C-BEMD can more effectively extract the complex-domain features of SAR images, thereby improving the overall recognition performance. Also, the better performance than the multifeature and multiresolution methods supports the higher discrimination of the BIMFs decomposed by C-BEMD.

4.4. Depression Angle Differences

Relying on the experimental setup in Table 3, the proposed method is tested under the condition of depression angle differences. All the methods perform the classifications at 30° and 45° depression angles, respectively, and the results are summarized in Figure 6. At the depression angle of 30°, the average recognition rates of the various methods remain above 93%, indicating that the image differences caused by the depression angle difference are relatively small in this case. However, at the 45° depression angle, the performance of the various methods drops significantly because the image differences caused by the depression angle difference are much more significant. The proposed method maintains the highest recognition rate in both cases, which shows its robustness to depression angle differences. Compared with the BEMD method, the performance of the proposed method is greatly improved, which illustrates the effectiveness of C-BEMD for SAR image feature extraction. With significant differences between the training and test samples, the Res-Net and deep feature methods experience the largest degradations at 45° depression angle among all the methods because the trained networks can hardly discern test samples that have low similarity with the training ones.

4.5. Noise Interference

Relying on the constructed noisy test sets at multiple SNRs, the proposed method is evaluated under noise interference conditions. Figure 7 plots the average recognition rates achieved by the different methods as a function of SNR. The proposed method achieves the highest recognition rate at each noise level, which shows its robustness to noise interference. The C-BEMD used in this paper analyzes and extracts features from SAR images in the complex domain and obtains features that are robust against noise interference. During the decomposition, a certain denoising process is actually carried out, which can be observed in the implementation steps of C-BEMD. In the classification process, the discriminative information of the different BIMFs is combined and fused through the multitask sparse representation. Therefore, the proposed method can maintain a high level of performance under noise interference. Among the six reference methods, the BEMD method outperforms the remaining ones, further supporting the noise robustness of the decomposed BIMFs. The two CNN-based methods achieve the lowest recognition rates, especially at low SNRs, because the networks trained on SAR images at high SNRs adapt poorly to test sets with heavy noise.

5. Conclusion

This paper applies C-BEMD to SAR image feature extraction and target recognition. C-BEMD is an extension of the traditional BEMD to the complex domain and can be directly used to process complex matrices. In this paper, C-BEMD is used to extract the features of complex SAR images as multilevel complex BIMFs, which can effectively reflect the time-frequency characteristics of SAR targets. In the classification stage, the multitask sparse representation is used to characterize the extracted BIMFs, and the target category of the test sample is determined according to the reconstruction errors. Based on the MSTAR dataset, the proposed method is tested and verified under SOC and typical EOCs including configuration differences, depression angle differences, and noise interference. In the experiments, the proposed method achieves the highest average recognition rate, 99.34%, under SOC. Its robustness under the three EOCs is also higher than that of the reference methods.

Data Availability

The MSTAR dataset used to support the findings of this work is available online at http://www.sdms.afrl.af.mil/datasets/mstar/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.