Abstract
Debris flows destroy facilities and seriously threaten human lives, especially in mountainous areas. Susceptibility mapping is key to hazard prevention. The aim of the present study is to compare the performance of three methods, Bayes discriminant analysis (BDA), logistic regression (LR) and random forest (RF), for debris flow susceptibility mapping in terms of three aspects: applicability, analyticity and accuracy. Nyalam County, a debris flow-prone area located in southern Tibet, was selected as the study area. First, a dataset containing 49 debris flow inventories and 16 conditioning factors was prepared. Subsequently, the dataset was divided into two groups at a ratio of 70/30 for training and validation purposes, and the division was repeated 5 times to obtain 5 different groups. Then, all 16 factors were used in the RF model, while the 11 factors with low linear correlation were used for BDA and LR. Finally, receiver operating characteristic curves, the area under the curve (AUC) and contingency tables were applied to evaluate the accuracy of the three models. The prediction rates were 74.6–81.8%, 74.6–83.6% and 80–92.7% for BDA, LR and RF, while the AUC values of the three models were 0.72–0.78, 0.82–0.92 and 0.90–0.99, respectively. Compared with LR and BDA, RF not only processed and preserved the dataset effectively without prior assumptions, but also yielded a reasonable susceptibility zoning map and reasonable major factors. The conclusions of the current study are useful for risk mitigation and land use planning in the study area and provide a reference for other research.
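The validation workflow summarized above (repeated 70/30 splits, contingency tables, AUC) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the stratified splitting, the toy labels and the reading of "prediction rate" as overall accuracy, (TP + TN) / total, are all assumptions made for illustration.

```python
import random


def split_70_30(labels, seed):
    """Return (train, valid) index lists for one 70/30 split.

    Assumption: the split is stratified by class so that each group keeps
    the same share of debris-flow (1) and non-debris-flow (0) samples.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(0.7 * len(idx))
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)


def contingency(y_true, y_pred):
    """Return the contingency table counts (TP, TN, FP, FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def prediction_rate(y_true, y_pred):
    """Overall accuracy, (TP + TN) / total (assumed meaning of the term)."""
    tp, tn, _, _ = contingency(y_true, y_pred)
    return (tp + tn) / len(y_true)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy labels, not the Nyalam dataset.
    labels = [1] * 10 + [0] * 10
    for seed in range(5):  # five repetitions give five different groups
        train, valid = split_70_30(labels, seed)
        print(seed, len(train), len(valid))
```

Each of the five splits would be used to fit a model on the training indices and to compute `prediction_rate` and `auc` on the validation indices, which is one way to obtain ranges of results like those reported above.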
Abbreviations
- BDA: Bayes discriminant analysis
- LR: Logistic regression
- RF: Random forest
- ROC: Relative operating characteristic
- AUC: Area under curve
- DFS: Debris flow susceptibility
- GPS: Global positioning systems
- GIS: Geographic information systems
- RS: Remote sensing
- DEM: Digital elevation model
- SRTM: Shuttle Radar Topography Mission
- Sig: Significant parameter value
- OOB: Out of bag
- AGMC: Average gradient of main channel
- MED: Maximum elevation difference
- MODIS: Moderate-resolution Imaging Spectroradiometer
- ASA: Average slope angle
- RED: Relative cutting depth
- FL: Fault length
- FD: Fault density
- DTF: Distance to fault
- NDVI: Normalized difference vegetation index
- MCL: Main channel length
- DD: Drainage density
- DTR: Distance to road
- VIF: Variance inflation factor
- FR: Frequency ratio
- TP: True positives
- TN: True negatives
- FP: False positives
- FN: False negatives
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 41572257 and 41972267).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Liang, Z., Wang, CM., Zhang, ZM. et al. A comparison of statistical and machine learning methods for debris flow susceptibility mapping. Stoch Environ Res Risk Assess 34, 1887–1907 (2020). https://doi.org/10.1007/s00477-020-01851-8
DOI: https://doi.org/10.1007/s00477-020-01851-8