Assessment of groundwater arsenic contamination using machine learning in Varanasi, Uttar Pradesh, India
Journal of Water & Health (IF 2.5), Pub Date: 2022-05-01, DOI: 10.2166/wh.2022.015
S Kumar 1 , J Pati 1

This paper presents a machine learning approach for classifying arsenic (As) levels as safe or unsafe in groundwater samples collected from the Indo-Gangetic region. Because water is essential for sustaining life, heavy metals such as arsenic pose a public health concern. In this study, several tree-based machine learning models, namely Random Forest, Optimized Forest, CS Forest, SPAARC, and REP Tree, were applied to classify water samples. According to World Health Organization (WHO) guidelines, the arsenic concentration in water should not exceed 10 μg/L. The groundwater quality parameters were ranked using a classifier attribute evaluator for training and testing the models. Metrics derived from the confusion matrix, namely accuracy, precision, recall, and false positive rate (FPR), were used to analyze model performance. Among all models, Optimized Forest outperformed the other classifiers, with an accuracy of 80.64%, a precision of 80.70%, a recall of 97.87%, and a low FPR of 73.33%. The Optimized Forest model can be used to classify arsenic levels in new groundwater samples.
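As a rough illustration of the evaluation described in the abstract, the following Python sketch labels a sample against the WHO limit of 10 μg/L and derives accuracy, precision, recall, and FPR from confusion-matrix counts. The function and class names are hypothetical, and the counts in the usage lines (46, 1, 11, 4) are assumptions chosen only because they are arithmetically consistent with the reported percentages; the abstract does not state the actual confusion matrix, the test-set size, or which class was treated as positive.

```python
from dataclasses import dataclass

# WHO guideline value cited in the abstract (micrograms per litre).
WHO_LIMIT_UG_PER_L = 10.0


def label_sample(arsenic_ug_per_l: float) -> str:
    """Return 'unsafe' if the concentration exceeds the WHO limit, else 'safe'."""
    return "unsafe" if arsenic_ug_per_l > WHO_LIMIT_UG_PER_L else "safe"


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tp: int  # positives predicted positive
    fn: int  # positives predicted negative
    fp: int  # negatives predicted positive
    tn: int  # negatives predicted negative

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def false_positive_rate(self) -> float:
        return self.fp / (self.fp + self.tn)


# Assumed counts for illustration only; they are NOT given in the abstract.
cm = ConfusionMatrix(tp=46, fn=1, fp=11, tn=4)
print(f"accuracy  = {cm.accuracy():.4f}")
print(f"precision = {cm.precision():.4f}")
print(f"recall    = {cm.recall():.4f}")
print(f"FPR       = {cm.false_positive_rate():.4f}")
```

Under these assumed counts, a high recall alongside a high FPR would mean the classifier assigns most samples to the positive class, which is worth keeping in mind when reading the reported figures.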




Updated: 2022-05-01