Deep support vector machine for hyperspectral image classification

https://doi.org/10.1016/j.patcog.2020.107298

Abstract

To improve the robustness of traditional machine learning approaches, emphasis has recently shifted to integrating such methods with Deep Learning techniques. However, the classification problems, complexity, and inconsistency in several spectral classifiers developed for hyperspectral images are some of the reasons warranting further research. This study investigates the application of the Deep Support Vector Machine (DSVM) to hyperspectral image classification. Two hyperspectral images, Indian Pines and University of Pavia, are used as tentative test beds for the experiment. The DSVM is implemented with four kernel functions: Exponential Radial Basis Function (ERBF), Gaussian Radial Basis Function (GRBF), neural, and polynomial. Stand-alone Support Vector Machines form the interconnecting weights of the entire network. The network is trained with one hundred input datasets, and the interconnecting weights of the network are initialised using the regularisation parameter of the model. Numerical results show that the classification accuracies of the DSVM for Indian Pines and University of Pavia with each kernel function are: ERBF (98.87%, 98.16%), GRBF (98.90%, 98.47%), neural (98.41%, 97.27%), and polynomial (99.24%, 98.79%). When the DSVM algorithm is compared against well-known classifiers, namely the Support Vector Machine (SVM), Deep Neural Network (DNN), Gaussian Mixture Model (GMM), K Nearest Neighbour (KNN), and K Means (KM) classifiers, the mean classification accuracies for Indian Pines and University of Pavia are: DSVM (98.86%, 98.17%), SVM (76.03%, 73.52%), DNN (94.45%, 93.79%), GMM (76.82%, 78.35%), KNN (76.87%, 78.80%), and KM (21.65%, 18.18%). These results indicate that the DSVM outperformed the other classification algorithms. The high accuracy obtained with the DSVM validates its efficacy as a state-of-the-art algorithm for hyperspectral image classification.
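The mean DSVM accuracies quoted in the abstract appear consistent with a simple average over the four kernel results. The short check below is only an arithmetic illustration of that reading; the function and variable names are introduced here and are not from the paper, and the averaging rule itself is an assumption inferred from the reported figures.

```python
# Per-kernel DSVM accuracies (%) as reported in the abstract,
# ordered as (Indian Pines, University of Pavia).
dsvm_by_kernel = {
    "ERBF": (98.87, 98.16),
    "GRBF": (98.90, 98.47),
    "neural": (98.41, 97.27),
    "polynomial": (99.24, 98.79),
}


def mean_accuracy(scores, column):
    """Average one column (0 = Indian Pines, 1 = University of Pavia)."""
    values = [pair[column] for pair in scores.values()]
    return sum(values) / len(values)


# Assumption: the reported DSVM means are unweighted averages over kernels.
indian_pines_mean = mean_accuracy(dsvm_by_kernel, 0)
pavia_mean = mean_accuracy(dsvm_by_kernel, 1)
print(f"Indian Pines: {indian_pines_mean:.3f}%, University of Pavia: {pavia_mean:.3f}%")
```

The averages come to about 98.855% and 98.1725%, which agree with the reported DSVM means of 98.86% and 98.17% up to rounding.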

Introduction

Hyperspectral images (HSIs) have proved invaluable because of their numerous applications and their ability to capture remotely sensed information from the visible through the near-infrared wavelength ranges, thus providing multi-spectral channels from the same location (e.g. [10], [27]). HSIs are highly innovative remote sensing imageries that consist of hundreds of contiguous narrow spectral bands, which, unlike conventional panchromatic and multispectral imageries, enable better discrimination of object classes [34]. However, the major challenge for scientists is how to classify HSIs efficiently (see, e.g., [17]). Some of these challenges, as detailed by Ghamisi et al. [10], include the increased presence of redundant spectral information and the high dimensionality of the observed data, among others. Several conventional unsupervised and supervised machine learning classifiers have been used for classifying HSIs, including prominent unsupervised classifiers such as Fuzzy C-Means (FCM) and K Means (KM). While notable conventional supervised classifiers (e.g., K Nearest Neighbour (KNN) and Gaussian Mixture Model (GMM)) have been used in the classification of HSIs, the use of contemporary classifiers such as the Support Vector Machine (SVM) and the Artificial Neural Network (ANN) is gradually emerging [20], [31].

Recently, however, emphasis has shifted from conventional methods to the integration of Deep Learning (DL) and ANN, which scientists have argued greatly enhances the robustness of the traditional ANN. For example, Paoletti et al. [28] showed that the use of DL to train the conventional ANN significantly increased the efficiency of the ANN for HSI classification. Haut et al. [11] implemented an integrated Deep Convolutional Neural Network (DCNN) using a new Bayesian approach and found that the hybrid of DL and the traditional ANN classifier enhanced the efficiency of the traditional ANN classifier. Furthermore, a novel guided-filter-based Deep Recurrent Neural Network (DRNN) for HSI classification has proved to be more efficient than the traditional ANN classifier [8]. Zhao et al. [35] recently proposed integrating a Convolutional Neural Network (CNN) with Gray Level Co-occurrence Matrix (GLCM) textural features for HSI classification using limited training samples. More recently (e.g., [19]), other DL ANN algorithms such as the Deep Belief Network have been employed in HSI classification, with emphasis on their merits relative to conventional methods. However, classification problems associated with small training datasets in traditional machine learning techniques have been reported (e.g., [4]).

Recent advances in convolutional neural networks, including activation functions, loss functions, regularization, optimization, and fast computation, have been documented. Gu et al. [7], who provided details on these advances, also highlighted the weaknesses of CNNs, indicating that computational efficiency and the choice of suitable hyper-parameters (e.g., learning rate, kernel sizes of convolutional filters) are still challenging issues, especially for large-scale data. The use of a new CNN architecture for the classification of hyperspectral images was therefore predicated on the computational constraints of CNN algorithms when applied to high-dimensional data contained in multidimensional data cubes [28]. Although the application of deep learning techniques, especially CNNs, in image-based cancer detection, diagnosis, and other disciplines has shown significant strength [13], [18], improvements are required to handle large-scale multi-resolution data cubes. It is against this background that assessing the skills of other non-parametric deep learning algorithms such as the deep SVM has become necessary.

In addition to other classification methods such as random forests, neural networks, and logistic regression-based techniques, the SVM [5] is another robust classifier that has been used in hyperspectral data classification (e.g., [2], [10]). Since its introduction, the SVM has proven to be very efficient in remote sensing (RS) image classification, tide analysis, and prediction of urban land use change (e.g., [24]). Owing to their ability to model complex real-world data, SVMs were found to be relatively better predictive models than agent-based models, whose inability to evaluate model constraints and results has been highlighted in previous studies (e.g., [23], [25], [29], [30], [36]). Even though new algorithms such as Supervised Fuzzy Partitioning are now competitive with the SVM, the latter is still largely characterized as a state-of-the-art algorithm [1]. The SVM is intrinsically a binary classifier; it can, however, be modified for multi-class problems, mainly by using the One Against All (OAA) or One Against One (OAO) technique [14]. The OAA and OAO techniques have proven to be considerably effective in the classification of remote sensing images [26].
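The difference between the OAA and OAO schemes can be illustrated with a small sketch. It assumes the binary SVMs have already been trained and only shows how their outputs are combined; the class names, decision scores, and the `pairwise_winner` helper are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def n_binary_classifiers(n_classes):
    """Return how many binary SVMs OAA and OAO each need for n_classes."""
    one_against_all = n_classes
    one_against_one = n_classes * (n_classes - 1) // 2
    return one_against_all, one_against_one


def predict_oaa(scores):
    """OAA: one 'class vs rest' decision value per class; the largest wins."""
    return max(scores, key=scores.get)


def predict_oao(pairwise_winner, classes):
    """OAO: one vote per pair of classes; the class with most votes wins."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b)] += 1
    return max(votes, key=votes.get)


# Hypothetical decision values for a single pixel (not from the paper).
scores = {"corn": -0.4, "grass": 1.3, "woods": 0.2}

# Hypothetical stand-in for a trained pairwise SVM: the higher score wins.
def pairwise_winner(a, b):
    return a if scores[a] > scores[b] else b


oaa_label = predict_oaa(scores)
oao_label = predict_oao(pairwise_winner, list(scores))
counts_16 = n_binary_classifiers(16)  # 16 is an illustrative class count
```

For K classes, OAA trains K binary classifiers while OAO trains K(K-1)/2, so OAO grows quadratically with the number of classes but each of its classifiers sees only two classes' worth of samples.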

The classification problems that still exist with traditional machine learning techniques, together with the complexity (e.g., availability of training samples) surrounding the implementation of different classification algorithms, are some of the reasons warranting further research on ideal techniques for HSI. This was echoed by Ghamisi et al. [10], who noted the inconsistency in several spectral classifiers developed for HSI based on selected metrics. Furthermore, the use of deep belief networks in improving classification outputs of hyperspectral and multi-temporal images has recently been demonstrated [15], [19]. Whereas several parametric and prominent non-parametric algorithms have been widely used in image classification (see, e.g., [10], [20], [31]), the assessment and accuracy of HSI classification based on the Deep Support Vector Machine (DSVM) is largely undocumented. One of the key challenges with HSI classification is limited training samples. It is for this reason that the development of new optimization algorithms such as deep learning is becoming increasingly popular and effective in the fields of image recognition and classification, especially for the classification of large multi-spectral and hyperspectral datasets [28]. Moreover, the success of these new algorithms in automatic feature extraction, computer vision, language processing, and speech recognition has recently been re-echoed [17]. Nonetheless, there are still some constraints on the application of these methods to multispectral and hyperspectral images. A regularized ensemble framework of deep learning that incorporates SVM is therefore crucial to further explore these challenges.

Arguably, deep learning algorithms have attracted the attention of the remote sensing community and of experts in speech recognition, computer vision, and natural language processing, among others. To explore the application of deep learning methods in hyperspectral image classification, a multi-grained network that appears to be an ensemble deep learning method has been proposed [27]. Another hybrid model that integrates an unsupervised deep belief network with a one-class SVM was found to be scalable and computationally efficient in an earlier study [6]. While theoretical foundations and optimization techniques for learning deep CNN architectures are required [7], ensemble deep learning methods have shown considerable potential in the classification of HSIs. For example, the coupling of a deep belief network with an SVM algorithm addressed complexity and scalability issues of SVM in large datasets [6]. Hybrid models are therefore emerging as efficient, accurate, and scalable techniques that can improve the classification accuracy of large-scale and high-dimensional data.

The aim of this research, therefore, is to integrate DL and SVM to formulate a hybrid DSVM. The architecture of the DSVM proposed for this experiment imitates the Deep Neural Network (DNN), which consists of multiple hidden layers. Normally, the hidden layer neurons are connected by a series of weights; here, instead, the weights are initialised by several SVM functions modified with the SVM regularisation parameter. The optimal DSVM output for each input is found by updating all the connecting SVM functions in the hidden layer. Two HSIs, Indian Pines and University of Pavia, are used as experimental and tentative test beds for the study. The OAA multi-class technique is used to modify the SVM for multi-class separation. To assess the robustness of our hybrid DSVM model, the results of the DSVM are compared to those of the SVM, DNN, GMM, KNN, and KM.
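One way to picture a layered arrangement of SVM functions is the forward pass sketched below. It is not the authors' implementation: it assumes a generic stacking in which hidden units are kernel SVM decision functions whose outputs feed one output SVM unit, and it omits training and weight updates entirely. The kernel formulas are common textbook forms, and every parameter name (`sigma`, `degree`, `scale`, `offset`), support vector, and coefficient is hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Textbook forms of the four kernels named in the abstract (assumed, since
# the exact parameterisation is not given in this excerpt).
def erbf(u, v, sigma=1.0):
    return math.exp(-dist(u, v) / (2 * sigma ** 2))


def grbf(u, v, sigma=1.0):
    return math.exp(-dist(u, v) ** 2 / (2 * sigma ** 2))


def neural(u, v, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(u, v) + offset)


def polynomial(u, v, degree=2):
    return (dot(u, v) + 1) ** degree


def svm_unit(x, support_vectors, coeffs, bias, kernel):
    """Kernel SVM decision value: sum_i coeff_i * K(sv_i, x) + bias."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias


def dsvm_forward(x, hidden_units, output_unit, kernel):
    """Hidden SVM units transform x; an output SVM unit scores the result."""
    hidden = [svm_unit(x, *unit, kernel) for unit in hidden_units]
    return svm_unit(hidden, *output_unit, kernel)


# Hypothetical (support_vectors, coeffs, bias) triples, for illustration only.
hidden_units = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5], 0.0),
    ([[1.0, 1.0], [-1.0, -1.0]], [0.3, -0.3], 0.1),
]
output_unit = ([[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0], 0.0)

score = dsvm_forward([1.0, 0.0], hidden_units, output_unit, polynomial)
label = 1 if score >= 0 else -1
```

Swapping `polynomial` for `erbf`, `grbf`, or `neural` changes only the kernel, which mirrors how the study reports results separately for each kernel function.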

Deep support vector machine framework

SVM classifies a binary problem using a linear hyperplane by assuming that the training set has $n$ training samples, that is, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where $x_i \in \mathbb{R}^N$ is an $N$-dimensional vector that belongs to one of the classes $y_i \in \{-1, +1\}$ [9]. The stated binary classification problem can be separated using a linear decision function,

$$f(x) = w \cdot x + b$$

where $w \in \mathbb{R}^N$ is a vector that determines the orientation of the desired hyperplane required for the separation, and $b \in \mathbb{R}$ is called the “bias.” The
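As a minimal illustration of the linear decision function f(x) = w·x + b, the sketch below evaluates it for two points and assigns each to class +1 or -1 by the sign of the result. The weight vector, bias, and sample points are hypothetical values, not quantities from the study, and assigning a score of exactly zero to class +1 is a convention chosen here.

```python
def decision_function(w, x, b):
    """Evaluate f(x) = w . x + b for plain Python lists w and x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Return +1 on or above the hyperplane f(x) = 0, otherwise -1."""
    return 1 if decision_function(w, x, b) >= 0 else -1


# Hypothetical hyperplane parameters and sample points.
w = [2.0, -1.0]
b = -0.5

point_a = [1.0, 0.5]  # f = 2.0 - 0.5 - 0.5 = 1.0
point_b = [0.0, 1.0]  # f = 0.0 - 1.0 - 0.5 = -1.5

label_a = classify(w, point_a, b)
label_b = classify(w, point_b, b)
```

Here w fixes the orientation of the separating hyperplane and b shifts it away from the origin, so changing b alone moves the boundary without rotating it.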

Results

The DSVM presented in this study was designed to mimic the operation of the DNN. The individual SVMs, that is, $f(x)$, were made to function as interconnecting weights of the network. The resulting outputs were compared against the target output $F(X)$ based on the backpropagation technique. One hundred distinct inputs $X_1, X_2, \ldots, X_{100}$ were used to obtain one hundred distinct outputs $F_1(X), F_2(X), \ldots, F_{100}(X)$. The regularisation parameter was used to initialise the network in order to ensure that the

Discussion and conclusion

Imbalanced, multi-class learning problems have recently been addressed by Yuan et al. [33], who used a regularized ensemble framework of DL methods. They showed that DL algorithms are capable of handling multi-class data sets because of the regularization parameter. Although several other sophisticated algorithms have been used to address similar problems, especially in image-based cancer detection and diagnosis [13], reduced computational cost and efficiency of ensemble-based approaches have

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Dr Onuwa Okwuashi is an expert in machine learning applications and remote sensing of the environment. He obtained his Ph.D. in 2011 at the Victoria University of Wellington, New Zealand. He is currently a senior lecturer at the University of Uyo, Nigeria, where he teaches analytical methods in geospatial science and machine learning applications. Apart from his university teaching experience, he is also a senior consultant at the University of Uyo Geoinformatics Consult.

References (36)

  • B. Pan et al. MugNet: deep learning for hyperspectral image classification using limited samples. ISPRS J. Photogramm. Remote Sens. (2018)
  • M.E. Paoletti et al. A new deep convolutional neural network for fast hyperspectral image classification. ISPRS J. Photogramm. Remote Sens. (2018)
  • J. Xu et al. Multi-model ensemble with rich spatial information for object detection. Pattern Recognit. (2020)
  • X. Yuan et al. A regularized ensemble framework of deep learning for cancer detection from multi-class, imbalanced training data. Pattern Recognit. (2018)
  • L. Zhao et al. LandSys: an agent-based cellular automata model of land use change developed for transportation analysis. J. Transp. Geogr. (2012)
  • P. Ashtari et al. Supervised fuzzy partitioning. Pattern Recognit. (2020)
  • B. Bigdeli et al. A multiple SVM system for classification of hyperspectral remote sensing data. J. Indian Soc. Remote Sens. (2013)
  • W. Chen et al. A comparative study of landslide susceptibility maps produced using support vector machine with different kernel functions and entropy data mining models in China. Bull. Eng. Geol. Environ. (2018)
Dr Christopher Ndehedehe is an expert in remote sensing hydrology and environmental geoinformatics. His Ph.D. thesis was awarded the Curtin University Chancellor's Commendation for Exceptional Higher Degree by Research. Christopher won the 2018 D. B. Johnston Award for Excellence in the Spatial Sciences area at Curtin University, Australia, where he obtained his Ph.D. in 2017. Christopher is a key scientist, providing leadership and training in the remote sensing components of several projects in the Australian Rivers Institute funded by the National Environmental Science Programme. Christopher is currently a Research Fellow at the Australian Rivers Institute, Griffith University, where he is driving innovative research directions in remote sensing of the environment and applications of advanced multivariate techniques to assess impacts of climate change on groundwater and ecological resources.
