Abstract

To address the shortcomings of existing synthetic aperture radar (SAR) image feature extraction, an SAR target recognition method based on the non-subsampled Shearlet transform (NSST) is proposed. NSST is used to decompose an SAR image into multilevel representations. These representations are translation-invariant and reflect both the dominant and the detailed properties of the target. In the classification stage, joint sparse representation is employed to represent the multilevel representations jointly. It represents the individual components independently while exploiting the inner correlations between them, which improves the precision of the joint representation. Finally, the target label of the test sample is determined according to the overall reconstruction error. Experiments on the MSTAR dataset confirm the validity and robustness of the proposed method under the standard operating condition, configuration variance, depression angle variance, and noise corruption.

1. Introduction

Feature extraction is one of the key technologies for target recognition in synthetic aperture radar (SAR) images [1]. An SAR system must store the radar echoes: because the data are not collected at a single instant, the signal received over a certain time interval has to be accumulated before it can be processed. After A/D conversion, the digitized signal is stored, and the choice of storage medium must take into account the recording rate, the required storage capacity, and the speed at which the large volume of stored data can be read when azimuth compression and pulse compression are performed.

Designing appropriate features not only preserves the target characteristics in the SAR image but also significantly reduces the redundant information in the image, thereby improving the accuracy and efficiency of subsequent classification. To date, researchers have designed a large number of reliable features for the SAR target recognition problem, which can be divided into geometric shape features [2–4], electromagnetic features [5–7], and transform domain features [8–11]. Geometric shape features describe the geometric size and shape distribution of the target. Commonly used ones include the target area, contour, radar shadow, and feature vectors. The binary target area was used as the basic feature to design an SAR target recognition method in [2]. A recognition method based on the SAR target contour was designed in [3]. Electromagnetic features reflect target characteristics associated with electromagnetic scattering phenomena; typical representatives are the polarization mode and scattering centers. Ding et al. [5] improved the performance of SAR target recognition by introducing polarization information. Ding et al. and Zhang et al. [6, 7] extracted the scattering centers of the target based on the attributed scattering center model and then identified the target type by matching the scattering centers. Transform domain features use mathematical and signal processing methods to analyze the amplitude and phase distribution of the SAR image, thus greatly reducing the redundant information. One type uses matrix projection methods, such as principal component analysis (PCA) [8] and nonnegative matrix factorization (NMF) [9]. The other type relies on signal transforms, such as the wavelet transform [10] and the monogenic signal [11].

Once effective features of the SAR images have been obtained, a suitable classifier is selected to classify them and thus identify the target categories of unknown samples. Commonly used machine learning classifiers include K nearest neighbors (KNN) [8], support vector machines (SVM) [12], sparse representation classification (SRC) [13], and deep neural networks [14–16]. Most current feature extraction methods do not analyze SAR images comprehensively enough and reflect only one aspect of the target. Obtaining multilevel features through a comprehensive analysis of SAR images helps improve the performance of subsequent classification.

To address the shortcomings of existing SAR image feature extraction, this paper proposes an SAR target recognition method based on NSST feature extraction. The image is decomposed by the non-subsampled shearlet transform (NSST) into multiple progeny images. These progeny images have the same size as the original image. One of them is a low-pass component that describes the main information of the original image; the remaining ones are high-pass components that reflect its detailed information. In addition, the progeny images offer multiscale description capability and good translation invariance. Owing to these properties, NSST has been widely used in image fusion, denoising, recognition, and other fields [17–22]. Combining multiple NSST progeny images provides more comprehensive information about SAR targets and therefore stronger support for subsequent classification and recognition. In this paper, joint sparse representation is used to represent the multiple progeny images. Joint sparse representation is a multitask learning algorithm [23, 24] that represents each task component independently while exploring the relationship between them, which helps improve the overall reconstruction accuracy. Finally, the target category of the test sample is determined according to the sum of the reconstruction errors of the progeny images.

2. Methods

2.1. Non-Subsampled Shearlet Transform (NSST)

The traditional Shearlet transform is built on composite wavelet theory and multiscale analysis, and it enables multiscale signal analysis. However, the Shearlet transform is not translation-invariant, which limits its flexible application in image analysis and other fields. To this end, researchers proposed NSST, which combines non-subsampled pyramid (NSP) filters with an improved shearing filter bank (SF). For image data with dimension n = 2, the affine system with composite dilations is

$$A_{FH}(\psi) = \left\{ \psi_{j,l,k}(x) = |\det F|^{j/2}\, \psi\left(H^{l} F^{j} x - k\right) : j, l \in \mathbb{Z},\ k \in \mathbb{Z}^{2} \right\}, \tag{1}$$

where $\psi \in L^{2}(\mathbb{R}^{2})$; F and H are 2 × 2 invertible matrices, and $|\det H| = 1$. If $A_{FH}(\psi)$ forms a tight frame, then its elements are called composite wavelets. F is the anisotropic dilation matrix, and $F^{j}$ is associated with scale transformation; H is the shear matrix, and $H^{l}$ is associated with a geometric transformation that keeps the area constant. When $F = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$ and $H = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, the composite wavelet is called a shearlet. Figure 1 shows the basic schematic diagram of NSST. The detailed decomposition process can be found in the literature [17–22].
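As a small numerical illustration of the two matrices in the affine system, the following sketch evaluates the composite matrix applied to x and the normalisation factor. It assumes the standard shearlet choice F = diag(4, 2) and H = [[1, 1], [0, 1]] and the form ψ(H^l F^j x − k); the function names `composite_matrix` and `normalisation` are hypothetical and introduced only for this example.

```python
import numpy as np

# Assumed standard shearlet matrices: anisotropic dilation F and shear H.
F = np.array([[4.0, 0.0], [0.0, 2.0]])
H = np.array([[1.0, 1.0], [0.0, 1.0]])


def composite_matrix(j: int, l: int) -> np.ndarray:
    """Return H^l F^j, the matrix applied to x in psi(H^l F^j x - k)."""
    return np.linalg.matrix_power(H, l) @ np.linalg.matrix_power(F, j)


def normalisation(j: int) -> float:
    """Return |det F|^(j/2), the amplitude factor at scale j."""
    return abs(np.linalg.det(F)) ** (j / 2)


# Scale j = 1 with shear l = 2.
M = composite_matrix(j=1, l=2)
```

The shear matrix has unit determinant, so it changes orientation without changing area, while each power of F stretches the two axes by different factors. This is what gives shearlets their anisotropic, direction-sensitive support.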

According to the basic properties and decomposition process of NSST, a multilevel decomposition structure is obtained when it is applied to an SAR image. The decomposition results have multiscale analysis capability, which provides richer information for characterizing targets in SAR images. In addition, the decomposition results are translation-invariant, which mitigates possible position deviations introduced when the target is centered in the SAR image. Therefore, SAR image features extracted with NSST help improve the overall accuracy and robustness of subsequent target recognition.
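The two properties stated above, same-size sub-bands and translation invariance, can be checked on a simplified stand-in. The sketch below is not NSST: it implements only a non-subsampled pyramid with Gaussian low-pass filters applied by circular (FFT) convolution and omits the directional shearing filters entirely. The filter choice, the `sigma` schedule, and the function names are assumptions made for illustration.

```python
import numpy as np


def gaussian_lowpass(shape, sigma):
    """Frequency response of a Gaussian low-pass filter on the FFT grid."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))


def nonsubsampled_pyramid(image, levels=3):
    """Split an image into one low-pass band and `levels` high-pass bands.

    No band is downsampled, so every output has the shape of the input.
    The low-pass band is returned first.
    """
    current = np.asarray(image, dtype=float)
    details = []
    for level in range(levels):
        response = gaussian_lowpass(current.shape, sigma=2.0 ** level)
        smooth = np.real(np.fft.ifft2(np.fft.fft2(current) * response))
        details.append(current - smooth)  # detail removed by this smoothing step
        current = smooth
    return [current] + details
```

Because nothing is downsampled, the bands sum back to the input, and circularly shifting the input shifts every band by the same amount. A decimated pyramid would not satisfy the second property.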

2.2. SAR Target Recognition Method Combined With Multilevel NSST Progeny Images
2.2.1. Joint Sparse Representation

NSST decomposes the original image into multilevel progeny images, which provide richer information for describing the target characteristics. To make full use of this information, this paper uses joint sparse representation to characterize these progeny images jointly. Joint sparse representation is a multitask learning algorithm that exploits the internal relationships among multiple related tasks, thereby improving the overall representation accuracy. Let the K different feature components of the test sample y be $[y^{(1)}, y^{(2)}, \ldots, y^{(K)}]$. Each of them can be sparsely represented over its corresponding dictionary:

$$y^{(k)} \approx D^{(k)} \alpha^{(k)}, \quad k = 1, 2, \ldots, K, \tag{2}$$

where $D^{(k)}$ is the dictionary corresponding to the k-th feature component and $\alpha^{(k)}$ is the corresponding sparse representation coefficient vector. Without considering the correlation between different components, the sparse representation coefficient vector of each task can be obtained by optimizing the objective function in formula (3):

$$\min_{\beta}\ g(\beta) = \sum_{k=1}^{K} \left\| y^{(k)} - D^{(k)} \alpha^{(k)} \right\|_{2}, \tag{3}$$

where $\beta = [\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(K)}]$ stores the sparse representation coefficient vector corresponding to each component. In fact, multiple feature vectors from the same sample are related to a certain extent, so the sparse representation coefficient matrix is subject to structural constraints, which can be expressed by the following formula:

$$\min_{\beta}\ g(\beta) + \lambda \left\| \beta \right\|_{2,1}, \tag{4}$$

where $\lambda > 0$ is the regularization parameter. The objective function in formula (4) adopts the $\ell_{1}/\ell_{2}$ norm, that is, the sum of the $\ell_{2}$ norms of the rows of $\beta$, to constrain $\beta$. In this case, to obtain a smaller objective function value, the sparse representation coefficients under different feature components are required to have similar nonzero element distributions, which reflects the correlation between different components.

According to the obtained sparse representation coefficient matrix, the total reconstruction error of all feature components for each training category is calculated according to formula (5):

$$\mathrm{identity}(y) = \arg\min_{i} \sum_{k=1}^{K} \left\| y^{(k)} - D_{i}^{(k)} \alpha_{i}^{(k)} \right\|_{2}, \tag{5}$$

where $D_{i}^{(k)}$ and $\alpha_{i}^{(k)}$ denote the parts of $D^{(k)}$ and $\alpha^{(k)}$ associated with the i-th training category. The target category of the test sample is the training category that produces the smallest total reconstruction error.
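The joint representation and the error-based decision can be sketched in a few lines. The paper does not state which solver it uses, so the example below swaps in a plain proximal-gradient iteration and a squared-error data term (both assumptions); the row-wise shrinkage step is the proximal operator of the ℓ1/ℓ2 norm. The function names, the `lam` value, and the iteration count are hypothetical.

```python
import numpy as np


def joint_sparse_codes(dictionaries, features, lam=0.1, n_iter=300):
    """Proximal-gradient sketch for an l1/l2-regularised joint model.

    dictionaries: list of K arrays, each (d_k, N), one column per training sample
    features:     list of K arrays, each (d_k,), components of the test sample
    Returns an (N, K) coefficient matrix whose columns share nonzero rows.
    """
    n_atoms, n_tasks = dictionaries[0].shape[1], len(dictionaries)
    # Step size from the largest Lipschitz constant over the K tasks.
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2 for D in dictionaries)
    beta = np.zeros((n_atoms, n_tasks))
    for _ in range(n_iter):
        grad = np.column_stack([
            D.T @ (D @ beta[:, k] - y)
            for k, (D, y) in enumerate(zip(dictionaries, features))
        ])
        z = beta - step * grad
        # Row-wise soft threshold couples the K tasks through shared rows.
        row_norms = np.linalg.norm(z, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        beta = shrink * z
    return beta


def classify(dictionaries, features, atom_labels, lam=0.1):
    """Pick the class with the smallest total reconstruction error."""
    atom_labels = np.asarray(atom_labels)
    beta = joint_sparse_codes(dictionaries, features, lam)
    errors = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        errors[int(c)] = sum(
            np.linalg.norm(y - D[:, mask] @ beta[mask, k])
            for k, (D, y) in enumerate(zip(dictionaries, features))
        )
    return min(errors, key=errors.get), errors
```

Here `atom_labels[n]` is the class of the n-th training sample, and column n of every dictionary is assumed to come from that same sample, so a shared row of the coefficient matrix corresponds to one training sample across all components.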

2.2.2. Target Recognition Process

Based on the above analysis, this paper designs the SAR target recognition framework shown in Figure 2, which can be summarized in the following key steps:
(1) Perform NSST decomposition of the training samples to obtain multilevel progeny images, and build an independent dictionary for each level.
(2) Decompose the test sample by NSST in the same way to obtain the corresponding multilevel progeny images.
(3) Jointly represent the multilevel progeny images of the test sample based on the joint sparse representation.
(4) Calculate the overall reconstruction error of each training category for the test sample according to formula (5), and determine the target category.

In the specific implementation, considering both recognition accuracy and efficiency, the four progeny images specified in [21] are used in the subsequent joint sparse representation. The first progeny image is the low-pass component, reflecting the overall information of the target. For every progeny image, the random projection dimension reduction method in [13] is used to obtain a 520-dimensional feature vector.
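The dimension reduction step can be illustrated as follows. This is a minimal sketch that assumes a Gaussian random projection matrix and a 64 × 64 progeny image; the exact construction in [13], the image size, and the function names are assumptions, and only the output dimension of 520 comes from the text.

```python
import numpy as np


def random_projection_matrix(n_pixels, n_features=520, seed=0):
    """Draw a fixed Gaussian projection matrix of shape (n_features, n_pixels)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_features, n_pixels)) / np.sqrt(n_features)


def project(band, projection):
    """Flatten one progeny image and map it to a low-dimensional feature."""
    return projection @ np.asarray(band, dtype=float).ravel()


# Hypothetical 64 x 64 progeny image reduced to a 520-dimensional vector.
projection = random_projection_matrix(64 * 64)
feature = project(np.ones((64, 64)), projection)
```

The same projection matrix has to be reused for the training and test samples of a given level; otherwise the dictionary atoms and the test feature would live in different random subspaces.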

3. Results

3.1. Experimental Data Set

The MSTAR public data set is used to test the performance of the method proposed in this paper. This data set contains high-resolution (0.3 m) SAR images of ten types of ground military vehicle targets collected by an X-band airborne SAR sensor. It is currently an important data set for validating SAR target recognition algorithms. Table 1 lists the specific categories of these ten targets and the typical experimental settings under standard operating conditions (SOC). The training set is collected at a depression angle of 17°, and the test set is collected at a depression angle of 15°. Owing to the diversity of SAR image acquisition conditions in the MSTAR data set, a variety of other experimental conditions can also be set up, such as model differences and depression angle differences. In the experiments, several existing SAR target recognition methods were selected for comparison, including the SVM-based method in [12], the SRC-based method in [13], and the CNN designed in [14].

4. Implications

4.1. Standard Operating Conditions

The recognition performance of the proposed method is tested under standard operating conditions based on the experimental settings in Table 1. The results are presented as the confusion matrix in Figure 3. The elements on the diagonal give the correct recognition rate of the corresponding target under the current conditions, and the remaining elements give the probability of it being misidentified as each of the other targets. All types of targets are classified with a recognition rate of over 98%. Applying the same test to the comparison algorithms yields the average recognition rate of each method, as listed in Table 2. The proposed method tops the list with a recognition rate of 99.14%, which demonstrates its effectiveness. The CNN method also achieves a high recognition rate under standard operating conditions, mainly because a classification network trained with sufficient training samples adapts well to the test samples.
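For reference, the quantities read off a confusion matrix can be computed as in the sketch below. It uses made-up toy labels, not MSTAR results, and it assumes that the average recognition rate is the fraction of all test samples classified correctly; the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count matrix with true classes on rows and predicted classes on columns."""
    counts = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(counts, (np.asarray(true_labels), np.asarray(predicted_labels)), 1)
    return counts


def recognition_rates(counts):
    """Per-class rates (normalised diagonal) and the overall average rate."""
    per_class = np.diag(counts) / counts.sum(axis=1)
    average = np.trace(counts) / counts.sum()
    return per_class, average


# Toy example with three classes and six test samples.
counts = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2], n_classes=3)
per_class, average = recognition_rates(counts)
```

In this toy case one class-1 sample is assigned to class 2, so the per-class rates are 1.0, 0.5, and 1.0, and five of the six samples are correct overall.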

4.2. Model Difference

The recognition difficulty caused by differences between models of the same target has attracted wide attention in SAR target recognition. Table 3 shows a typical experimental setup under model differences, including three types of targets: BMP2, BTR70, and T72. As the table shows, the test samples and training samples of the three types of targets come from completely different models. Correctly identifying other models after learning from typical models is of great practical significance. Table 4 shows the average recognition rate of the different methods under model differences. Compared with standard operating conditions, the recognition performance of all methods declines to varying degrees. The recognition rate of the proposed method declines the least, so it still gives the best recognition results. The CNN method shows the most obvious decline in average recognition rate, mainly because a network trained on a single model adapts poorly to other models.

4.3. Depression Angle Difference

When the radar operates at two different depression angles, the two SAR images obtained of the same target differ considerably, which significantly increases the difficulty of target recognition. Table 5 shows a typical experimental setup under depression angle differences, including three types of targets: 2S1, BRDM2, and ZSU23/4. The training samples come from a 17° depression angle and the test samples come from 30° and 45° depression angles, respectively, so there is a large depression angle difference between the test and training samples. Figure 4 shows the average recognition rate of the various methods at the different depression angles. The proposed method achieves better performance than the other methods at both 30° and 45°, which demonstrates its robustness to depression angle differences.

4.4. Noise Interference

SAR images contain a large amount of noise, which prevents some target characteristics from being well reflected. The signal-to-noise ratio (SNR) of the original MSTAR SAR images is relatively high and therefore does not fully reflect actual reconnaissance conditions. For this reason, this experiment adds different levels of simulated noise to the original test samples of the ten types of targets in Table 1 and then obtains the average recognition rate of the different methods at each noise level, as shown in Figure 5. As the noise interference intensifies, the performance of all methods declines significantly. The proposed method, however, remains more robust under noise interference, and its performance advantage is most obvious at low SNR.
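The text does not specify how the noise is simulated. One common choice, shown here purely as an assumption, is to add white Gaussian noise whose power is set from the mean image power and a target SNR in decibels. The function names `add_noise` and `measured_snr_db` are hypothetical.

```python
import numpy as np


def add_noise(image, snr_db, rng=None):
    """Add white Gaussian noise so that the result has roughly the given SNR.

    Assumption: SNR is the ratio of mean image power to noise variance, in dB.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    signal_power = np.mean(image ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
    return image + noise


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean image and its noisy version."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

Sweeping `snr_db` over a range of values and re-running the classifier on the corrupted test samples would produce a curve of recognition rate against noise level of the kind the experiment describes.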

5. Conclusions

This paper proposes an SAR image target recognition method based on NSST feature extraction. The method uses NSST to decompose the original image into multiple progeny images. These progeny images reflect not only the main characteristics of the original image but also the local details of the target. Therefore, the combined multilevel NSST decomposition provides richer information for correct target recognition. In the classification stage, joint sparse representation is used to characterize the four levels of progeny images jointly, and the target category of the test sample is determined according to the overall reconstruction error. Validation experiments were carried out on the MSTAR data set. The experimental results show that the method maintains excellent performance under standard operating conditions, model differences, depression angle differences, and noise interference. In particular, the proposed method outperforms the comparison methods in average recognition rate under depression angle differences and noise interference.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.