Prediction on the fluoride contamination in groundwater at the Datong Basin, Northern China: Comparison of random forest, logistic regression and artificial neural network
Applied Geochemistry (IF 3.4), Pub Date: 2021-07-24, DOI: 10.1016/j.apgeochem.2021.105054
Mouigni Baraka Nafouanti 1, Junxia Li 1,2, Nasiru Abba Mustapha 1,3, Placide Uwamungu 4, Dalal AL-Alimi 5

Fluoride in groundwater poses a health risk to humans, and analyzing groundwater quality is time-consuming and expensive. Statistical methods provide a valuable approach to studying the spatial distribution of groundwater fluoride. In this study, Random Forest (RF), Artificial Neural Network (ANN), and Logistic Regression (LR) were used to predict groundwater fluoride in the Datong Basin. Hydrochemical data from 482 groundwater samples were used to evaluate the performance of the three statistical techniques and to extract the main factors controlling fluoride enrichment in groundwater. The data were split into two parts for the statistical analysis: 80% for training and 20% for testing. The chi-squared test was applied to select the most relevant variables, and TDS, Cl⁻, NO₃⁻, Na⁺, HCO₃⁻, SO₄²⁻, K⁺, Zn, Ca²⁺, and Mg²⁺ were selected as the best inputs for fluoride prediction. Models were evaluated using the confusion matrix and the area under the receiver operating characteristic curve (ROC AUC). With the ten input variables, the accuracies of RF, ANN, and LR were 0.89, 0.85, and 0.76, respectively. The mean decrease in impurity (MDI) and permutation feature importance show that eight of the ten parameters (TDS, Cl⁻, NO₃⁻, Na⁺, HCO₃⁻, SO₄²⁻, Ca²⁺, and Mg²⁺) are the variables influencing groundwater fluoride in the study area. RF was the best-performing model, predicting groundwater fluoride contamination in the study area with high conformity and confidence.
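
The two evaluation measures named in the abstract, the confusion matrix and ROC AUC, can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the labels and scores below are hypothetical values invented for illustration (1 stands for a high-fluoride sample under whatever threshold is chosen), the 0.5 cut-off is an assumption, and the function names are our own.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample is scored
    higher than a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and model scores (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
print(roc_auc(y_true, scores))           # 0.9375
```

Accuracy depends on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is one reason for reporting both.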



Updated: 2021-08-05