Comparative landslide spatial research based on various sample sizes and ratios in Penang Island, Malaysia

  • Original Paper
  • Bulletin of Engineering Geology and the Environment

Abstract

This paper compares how different sample sizes and sample ratios influence machine learning (ML) models, namely support vector machine (SVM) and artificial neural network (ANN), when they are used to produce landslide susceptibility maps (LSMs) for Penang Island, Malaysia. Traditional statistical (TS) models are also used to produce LSMs for comparison. The receiver operating characteristic (ROC) curve and the recall metric are applied to evaluate model performance. Based on these evaluation criteria, the ML models outperform the TS models, and ML models trained on datasets with larger sample sizes perform better. ML models, especially SVM models, also perform better when trained on balanced datasets as well as on datasets containing more landslide samples. The Kruskal-Wallis test and the Mann-Whitney U test are applied to test significance. The results indicate that sample size and sample ratio are essential factors when using ML models to produce LSMs. The LSMs produced in this research can provide valid and useful information to local authorities for landslide mitigation and prediction.
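As an illustration of the two evaluation criteria named in the abstract, the following minimal sketch computes recall and the area under the ROC curve for a binary landslide / non-landslide classifier. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and the AUC is computed by direct pairwise comparison (the probability that a landslide sample scores higher than a non-landslide sample), which is one standard way to obtain it.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true landslide samples (label 1) predicted as landslide.

    Assumes at least one sample with label 1 is present.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen landslide sample receives
    a higher score than a randomly chosen non-landslide sample; ties count
    as one half. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 landslide samples and 5 non-landslide samples,
# i.e. an imbalanced sample ratio of 3:5.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"recall  = {recall(y_true, y_pred):.3f}")
print(f"ROC AUC = {roc_auc(y_true, scores):.3f}")
```

Recall depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two metrics are often reported together when the landslide to non-landslide ratio varies.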

Funding

This study was funded by RUI Grant (1001/PMATHS/8011093) from Universiti Sains Malaysia.

Author information

Corresponding author

Correspondence to Pei Shan Fam.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Cite this article

Gao, H., Fam, P.S., Tay, L.T. et al. Comparative landslide spatial research based on various sample sizes and ratios in Penang Island, Malaysia. Bull Eng Geol Environ 80, 851–872 (2021). https://doi.org/10.1007/s10064-020-01969-7

  • DOI: https://doi.org/10.1007/s10064-020-01969-7
