Target spectrum based feature selection (TSFS): A new method based on chain coding for target detection problems
Introduction
In the last decade, hyperspectral imagery (HSI) obtained by remote sensing systems has provided high-spectral-resolution information over a wide range of the electromagnetic spectrum, with hundreds of spectral bands [1], [2]. This large number of bands makes hyperspectral data challenging to process. The main challenges with such high-dimensional data are low efficiency, poor performance, and the curse of dimensionality (COD) [3]. These problems are discussed in detail in [4].
Dimensionality reduction has been suggested to overcome these problems. It can be divided into (i) feature extraction and (ii) feature selection [5]. Feature extraction derives new features from the original ones to form a lower-dimensional space, while feature selection picks the most informative original features to form a reduced subspace. Unlike feature extraction, which builds a new space from constructed features, feature selection does not change the original features [6], [7].
Feature selection methods are typically grouped into three approaches (filter, wrapper, and embedded) according to how the selection algorithm relates to the model structure [8], [9]. The filter approach ranks features using a chosen scoring parameter: each feature is scored, and features whose score falls below a threshold are eliminated. The key property of a feature is how well it helps distinguish targets from the background in the data [10], [11]. In the wrapper approach, the learning classifier is treated as a black box, so the method lends itself to off-the-shelf machine learning techniques. The wrapper ranks the features according to the performance of the selected classifier. This approach may work better than the filter approach if the classifier has been chosen correctly [8], [12]. The wrapper's main drawback is the lack of monitoring of its internal operations. Embedded methods resemble wrapper methods in that they also optimize the performance of a learning algorithm, but they tend to reduce the computation time spent on reclassifying subsets. Their main advantage over wrapper methods is that an intrinsic model-building metric is used during learning, which addresses the wrapper's lack-of-monitoring problem [8], [9].
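As an illustration of the filter approach described above (score each feature, then drop everything below a threshold), the following minimal Python sketch may help. The function name `filter_select`, the Fisher-like separability score, the threshold value, and the synthetic data are all assumptions made for this example and are not taken from the cited works.

```python
import numpy as np

def filter_select(target, background, threshold):
    """Return indices of bands whose separability score meets the threshold.

    target, background: arrays of shape (n_pixels, n_bands).
    The score is a Fisher-like ratio chosen for illustration only.
    """
    mean_gap = np.abs(target.mean(axis=0) - background.mean(axis=0))
    spread = target.std(axis=0) + background.std(axis=0) + 1e-12  # avoid division by zero
    scores = mean_gap / spread
    keep = np.flatnonzero(scores >= threshold)
    return keep, scores

# Synthetic data (hypothetical): 6 bands, only bands 1 and 4 separate the classes.
rng = np.random.default_rng(0)
n_bands = 6
background = rng.normal(0.30, 0.02, size=(200, n_bands))
target = rng.normal(0.30, 0.02, size=(50, n_bands))
target[:, [1, 4]] += 0.20

keep, scores = filter_select(target, background, threshold=2.0)
print(keep)
```

In this toy setting only the two bands that actually separate target from background pass the threshold. Note that the score is computed per band, independently of any classifier, which is what distinguishes a filter from a wrapper.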
Over the past decade, a large number of feature selection methods have been proposed. Li et al. categorized feature selection methods into eight groups in 2018 [13]. We have added two further groups, wavelength-based and relief-based [14] methods, to their categorization, as shown in Fig. 1. These methods have sparked a controversy over whether feature selection is a necessary preprocessing step for target detection. Information-theory-based methods form a large family of existing feature selection methods. They use different heuristic filter criteria to assess the importance of features [13], and have been developed to maximize feature relevance and minimize feature redundancy [15]. Another family is wavelength-based methods, which pick features according to their wavelength information in cases where the wavelengths themselves matter. Some of these methods, known as wavelength-based learning methods, are partial least squares regression [16], [17], stepwise regression [18], and 2-dimensional singular spectrum analysis [19]; more sophisticated ones include the successive projections algorithm [20] and simulated annealing [21], [22]. However, all of these methods carry a high computational burden, and some of them are iterative. Furthermore, wavelength-based methods concentrate only on wavelength information, whereas target detection is based on the target's spectral signature. This means that not only the wavelength information but also the fluctuations of the target spectrum and the relationships between contiguous bands are important for detecting a target in the scene. Sparse-learning-based methods face these problems too. They aim to minimize the fitting error using sparse regularization terms, which involves solving a non-smooth optimization problem and, in many cases, complex matrix operations. For this reason, high computational burden and cost are further restrictions of these methods [13].
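The relevance-versus-redundancy trade-off mentioned above for information-theory-based methods can be sketched as a greedy selection loop. This is only a hedged illustration: absolute Pearson correlation is used here as a simple stand-in for mutual information, and the function name `greedy_relevance_redundancy` and the synthetic data are hypothetical.

```python
import numpy as np

def greedy_relevance_redundancy(X, y, k):
    """Greedily pick k columns of X with high relevance and low redundancy.

    Relevance: |corr(feature, y)|. Redundancy: mean |corr| with features
    already selected. Correlation replaces mutual information for brevity.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    redundancy = np.abs(np.corrcoef(X, rowvar=False))  # feature-to-feature |corr|
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        candidates = [j for j in range(n_features) if j not in selected]
        merit = [relevance[j] - redundancy[j, selected].mean() for j in candidates]
        selected.append(candidates[int(np.argmax(merit))])
    return selected

# Synthetic data (hypothetical): columns 0 and 1 are near-duplicates,
# column 2 carries independent information, column 3 is pure noise.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)
y = a + b
X = np.column_stack([a, a + 0.01 * rng.normal(size=500), b, rng.normal(size=500)])

selected = greedy_relevance_redundancy(X, y, k=2)
print(selected)
```

With two features requested, the near-duplicate pair contributes only one column and the independent column fills the other slot, which is the behaviour the relevance and redundancy criteria are meant to produce.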
Applying feature selection is controversial in some hyperspectral applications because it may decrease accuracy by eliminating distinctive features. Such features might be eliminated because they are not necessarily the principal components or the high-variance bands. The controversy is over whether feature selection is admissible as a preprocessing step for every hyperspectral application. Previous studies can be divided into two groups: opponents and proponents.
Opponents have argued that feature selection is not appropriate in any field of hyperspectral image analysis. They believe that feature selection decreases accuracy in target detection and unmixing applications. Their priority is accuracy, even at the expense of time, cost, computational burden, and complexity. Li et al. attributed accuracy loss in target detection to feature selection and showed that eliminating even a single band reduces accuracy [23]. Geng et al. showed that not only is feature selection not beneficial, but adding any independent band of the original image, even a noisy one, improves target detection performance [24]. Luyan et al. pointed out that target detection needs the full target spectral signature, so they used multispectral band generation methods to increase target detection accuracy [25]. Gross et al. concluded that feature selection before target detection reduces accuracy because of the effect of background materials on the retained bands [26].
The researchers on the second side of the controversy are proponents, who use feature selection in every hyperspectral application. They accept a small loss of accuracy rather than handling large amounts of data with the associated problems. Their priorities are managing time, cost, process optimization, and the Hughes phenomenon [27], [28], [29], [30]. Furthermore, they have argued that feature selection is a necessary preprocessing step for reducing the computational burden and maximizing algorithm performance, which justifies tolerating a small loss of accuracy [30]. Chen et al. explained that although feature selection using PCA and MNF reduces accuracy, keeping 1/2 or 1/3 of the original dimensions does not severely affect algorithm performance [31]. Besides this group, some studies apply feature selection to detection problems through various approaches, e.g. wavelength-based [16], [21], relief-based [9], and sparse learning [32] methods. Wavelength-based approaches, however, have two main problems. First, they are iterative, so they have a high computational burden and complexity. Second, they focus only on wavelength placement, whereas in some hyperspectral applications the wavelength number or band placement is not the only issue: it is the spectral signature that matters. These applications need to analyze the relationship between contiguous bands as well as the relationship between the distinctive bands. In this paper, we propose a new feature selection method based on the target spectrum, in order to address accuracy loss and high computational burden while retaining distinctive features with respect to the relationship between contiguous bands.
The proposed feature selection method is based on the idea of chain coding, which requires little computation. In this paper, three feature selection methods built on this idea, namely chain-filtering, chain-statistic, and chain-encoding, have been developed for hyperspectral target detection. We have chosen target detection because it is one of the hyperspectral image applications that provokes the controversy described above. It is also one of the most important applications of hyperspectral data, serving both civil and military purposes [33]. For this application, the paper aims to find the distinctive features that separate the target from the rest of the image.
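The details of the three proposed methods are not given in this part of the text, so the following Python sketch is not the authors' algorithm. It is only a generic illustration, under stated assumptions, of how a chain code can digitize a 1-D spectral signature with very little computation. The function names, the flatness tolerance `tol`, and the sample spectrum are hypothetical; the direction codes follow one common 8-direction Freeman numbering, restricted to the three directions a left-to-right curve can take.

```python
import numpy as np

def chain_code_spectrum(spectrum, tol=0.01):
    """Encode band-to-band changes of a 1-D spectrum as direction codes.

    0 = flat (east), 1 = rising (north-east), 7 = falling (south-east).
    `tol` is a hypothetical flatness tolerance in reflectance units.
    """
    diffs = np.diff(np.asarray(spectrum, dtype=float))
    codes = np.zeros(diffs.shape, dtype=int)
    codes[diffs > tol] = 1
    codes[diffs < -tol] = 7
    return codes

def direction_changes(codes):
    """Indices of bands at which the chain code changes direction."""
    return np.flatnonzero(codes[1:] != codes[:-1]) + 1

# Hypothetical 8-band reflectance spectrum.
spectrum = [0.30, 0.30, 0.35, 0.42, 0.40, 0.25, 0.25, 0.31]
codes = chain_code_spectrum(spectrum)
print(codes.tolist())                     # [0, 1, 1, 7, 7, 0, 1]
print(direction_changes(codes).tolist())  # [1, 3, 5, 6]
```

The encoding needs only one subtraction and two comparisons per band, and the positions where the code changes mark the peaks, troughs, and plateaus that shape the signature. This is the kind of contiguous-band relationship that wavelength-only criteria ignore.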
The remainder of this paper is organized as follows. Section 2 provides the theoretical background of the method, including the chain code algorithm and its application to digitizing the spectral signature, followed by the details of the three proposed feature selection methods. Section 3 introduces three hyperspectral datasets and presents the implementation and experimental results. Sections 4 and 5 close the paper with a discussion and conclusions, respectively.
Methodology
The accuracy decrement in some hyperspectral image applications, which the researchers on the first side of the controversy complain about, is caused by eliminating discriminative bands during feature selection. We must identify the bands that keep the target spectrum unique and separable from the background in the hyperspectral image. Simplicity and computational burden must also be considered for the new method to be superior to other feature selection methods.
In this paper, three
Datasets
In this section, the proposed TSFS approaches are examined and then compared with the PCA and MNF methods using three different datasets. The first is an AVIRIS image of a cropped region of the Cuprite, Nevada dataset [41]. The classes in this region are geological materials, namely Kaolinite, Alunite, Andradite, Buddingtonite, Dumortierite, and Muscovite, which are used as targets in this paper (Fig. 6). The second dataset covers the Jasper Ridge region [41]. Each pixel is
Discussion
Reducing the data volume should not reduce accuracy. For target detection problems, however, feature selection with common information-theory-based, similarity-based, and sparse-learning-based methods does reduce accuracy, because some of the distinctive features are lost. Some researchers believe that a small loss of accuracy does not matter much, since confronting COD is the first priority. The second group of researchers believe all the
Conclusion
This paper set out to develop a feature selection method based on the target spectrum, suitable for hyperspectral applications that depend on the spectral signature. The target detection method used in this article is CEM, one of the best and most accurate target detection methods. It requires the target spectral signature as its only input. Therefore, spectrum signature of the target is very important here and also this importance
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (62)
- et al., Multi-temporal hyperspectral mixture analysis and feature selection for invasive species mapping in rainforests, Remote Sens. Environ. (2013)
- et al., A survey on feature selection methods, Comput. Electr. Eng. (2014)
- et al., Wrappers for feature subset selection, Artif. Intell. (1997)
- Feature selection: A data perspective, ACM Comput. Surv. (CSUR) (2018)
- Estimation of green grass/herb biomass from airborne hyperspectral imagery using spectral indices and partial least squares regression, Int. J. Appl. Earth Obs. Geoinf. (2007)
- et al., Stepwise regression data envelopment analysis for variable reduction, Appl. Math. Comput. (2015)
- Hyperspectral reflectance imaging combined with chemometrics and successive projections algorithm for chilling injury classification in peaches, LWT-Food Sci. Technol. (2017)
- et al., Deformation and fault parameters of the 2005 Qeshm earthquake in Iran revisited: A Bayesian simulated annealing approach applied to the inversion of space geodetic data, Int. J. Appl. Earth Obs. Geoinf. (2014)
- An effective feature selection method for hyperspectral image classification based on genetic algorithm and support vector machine, Knowl.-Based Syst. (2011)
- et al., Partial eigenvalue decomposition for large image sets using run-length encoding, Pattern Recogn. (1995)