Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping
Geocarto International ( IF 3.3 ) Pub Date : 2020-10-16 , DOI: 10.1080/10106049.2020.1831623
Emrehan Kutlug Sahin

Abstract

The aim of the study is to compare four recent gradient boosting algorithms, namely Gradient Boosting Machine (GBM), Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), for modelling landslide susceptibility (LS). In the first step of the study, a geodatabase including the landslide inventory map and the landslide conditioning factors was constructed. In the second step, a chi-square (CHI) statistic-based feature selection (FS) technique was used to compute the importance of the landslide causative factors. In the third step, the tree-based ensemble learning algorithms were applied to predict the potential distribution of landslide susceptibility. The prediction performance of these ensemble methods was also compared with that of the Random Forest (RF) ensemble method. Finally, the prediction capabilities of the methods were assessed using overall accuracy (Acc), area under the receiver operating characteristic curve (AUC), kappa index, root mean square error (RMSE), and F score measures. For further evaluation, McNemar's test was used to assess the statistical significance of the differences between the four gradient boosting models. The accuracy results indicated that the CatBoost model had the highest prediction capability (Acc = 0.8503 and AUC = 0.8975), followed by XGBoost (Acc = 0.8336 and AUC = 0.8860), LightGBM (Acc = 0.8244 and AUC = 0.8796), and GBM (Acc = 0.8080 and AUC = 0.8685). On the other hand, the accuracy measures considered in this study showed that the RF method had the lowest prediction capability of all the methods compared. Although each method individually performed at an acceptable level, the CatBoost method outperformed the others with respect to the AUC and Acc values estimated in this study.
The results of the study confirmed that these relatively new ensemble learning techniques are efficient and robust for producing LS maps; furthermore, these algorithms will probably be preferred more often in future studies because of their robustness.
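
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the labels, scores, and predictions below are made-up toy values, the function names are hypothetical, the F score is assumed to be F1, and McNemar's test is assumed to use the continuity-corrected chi-square form, none of which the abstract specifies.

```python
import math

def confusion(y_true, y_pred):
    """Count (tp, fp, fn, tn) for binary labels, with 1 = landslide."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def kappa(y_true, y_pred):
    """Cohen's kappa: agreement beyond what chance would produce."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)

def f1_score(y_true, y_pred):
    """F1 (assumed here; the abstract only says "F score")."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def rmse(y_true, scores):
    """Root mean square error between labels and predicted probabilities."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, scores)) / len(y_true))

def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mcnemar_chi2(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar statistic for two models on the same samples."""
    triples = list(zip(y_true, pred_a, pred_b))
    only_a = sum(1 for t, a, b in triples if a == t and b != t)  # A right, B wrong
    only_b = sum(1 for t, a, b in triples if a != t and b == t)  # A wrong, B right
    if only_a + only_b == 0:
        return 0.0
    return (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)

# Toy data (hypothetical, not from the paper): 4 landslide and 4 non-landslide cells.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]   # model A probabilities
pred_a = [1 if s >= 0.5 else 0 for s in scores_a]     # threshold at 0.5
pred_b = [1, 1, 0, 0, 0, 0, 1, 1]                     # model B hard predictions

print("Acc  :", accuracy(y_true, pred_a))
print("Kappa:", kappa(y_true, pred_a))
print("F1   :", f1_score(y_true, pred_a))
print("RMSE :", rmse(y_true, scores_a))
print("AUC  :", auc(y_true, scores_a))
print("McNemar chi2 (A vs B):", mcnemar_chi2(y_true, pred_a, pred_b))
```

With one degree of freedom, a McNemar statistic above 3.841 indicates a difference that is significant at the 0.05 level. On a real data set, library routines would normally replace these hand-written functions.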




Updated: 2020-10-16