Geoscience Frontiers (IF 4.202) Pub Date: 2020-09-15, DOI: 10.1016/j.gsf.2020.09.002 Husam A.H. Al-Najjar; Biswajeet Pradhan
In recent years, landslide susceptibility mapping has improved substantially with advances in machine learning. However, challenges remain because only limited inventory data are available. This paper presents a novel method that improves the performance of machine learning techniques by creating synthetic inventory data with Generative Adversarial Networks (GANs) to improve landslide prediction. A landslide inventory of 156 landslide locations in Cameron Highlands, Malaysia, was compiled from previous projects the authors worked on. Based on suggestions from previous work on Cameron Highlands, the following geo-environmental factors were considered: elevation, slope, aspect, plan curvature, profile curvature, total curvature, lithology, land use and land cover (LULC), distance to the road, distance to the river, stream power index (SPI), sediment transport index (STI), terrain roughness index (TRI), topographic wetness index (TWI) and vegetation density. To show the capability of GANs in improving landslide prediction models, this study tests the proposed GAN model with benchmark models, namely Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Trees (DT), Random Forest (RF) and Bagging ensemble models with ANN and SVM models. These models were validated using the area under the receiver operating characteristic curve (AUROC). Without the additional samples, the DT, RF, SVM, ANN and Bagging ensemble models achieved AUROC values of 0.90, 0.94, 0.86, 0.69 and 0.82 for training and 0.76, 0.81, 0.85, 0.72 and 0.75 for testing, respectively. With the additional samples, the same models achieved AUROC values of 0.92, 0.94, 0.88, 0.75 and 0.84 for training and 0.78, 0.82, 0.82, 0.78 and 0.80 for testing, respectively. The additional samples improved the test AUROC of all the models except SVM, whose test AUROC fell from 0.85 to 0.82.
These results show that, in data-scarce environments, using GANs to generate supplementary samples is promising because it can improve the predictive capability of common landslide prediction models.
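The reported comparison can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the AUROC values quoted in the abstract. The variable and function names (`BASELINE`, `AUGMENTED`, `auroc_change`) are illustrative assumptions, not part of the paper, and the code does not reproduce the GAN or any of the models.

```python
# AUROC values as quoted in the abstract, in the order DT, RF, SVM, ANN, Bagging.
# All names here are illustrative; nothing below comes from the authors' code.
MODELS = ["DT", "RF", "SVM", "ANN", "Bagging"]

BASELINE = {  # trained on the original inventory only
    "train": [0.90, 0.94, 0.86, 0.69, 0.82],
    "test": [0.76, 0.81, 0.85, 0.72, 0.75],
}
AUGMENTED = {  # trained with the additional GAN-generated samples
    "train": [0.92, 0.94, 0.88, 0.75, 0.84],
    "test": [0.78, 0.82, 0.82, 0.78, 0.80],
}


def auroc_change(split):
    """Return the per-model AUROC change (augmented minus baseline) for a split."""
    return {
        model: round(after - before, 2)
        for model, before, after in zip(MODELS, BASELINE[split], AUGMENTED[split])
    }


if __name__ == "__main__":
    for split in ("train", "test"):
        print(split)
        for model, delta in auroc_change(split).items():
            print(f"  {model:8s} {delta:+.2f}")

    # Models whose test AUROC did not improve with the additional samples.
    not_improved = [m for m, d in auroc_change("test").items() if d <= 0]
    print("No test improvement:", not_improved)
```

Run as a script, this prints the per-model change for each split and lists the models whose test AUROC did not improve, which can be compared against the statement that only SVM failed to benefit.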