Machine learning approach identifies water sample source based on microbial abundance
Water Research (IF 11.4), Pub Date: 2021-04-27, DOI: 10.1016/j.watres.2021.117185
Chenchen Wang , Guannan Mao , Kailingli Liao , Weiwei Ben , Meng Qiao , Yaohui Bai , Jiuhui Qu

Water quality can change along a river system due to differences in adjacent land use patterns and discharge sources. These variations can induce rapid responses of the aquatic microbial community, which may be an indicator of water quality characteristics. In the current study, we used a random forest model to predict water sample sources from three different river ecosystems along a gradient of anthropogenic disturbance (i.e., less disturbed mountainous area, wastewater discharged urban area, and pesticide and fertilizer applied agricultural area) based on environmental physicochemical indices (PCIs), microbiological indices (MBIs), and their combination. Results showed that among the PCI-based models, using conventional water quality indices as inputs provided markedly better prediction of water sample source than using pharmaceutical and personal care products (PPCPs), and much better prediction than using polycyclic aromatic hydrocarbons (PAHs) and substituted PAHs (SPAHs). Among the MBI-based models, using the abundances of the top 30 bacteria combined with pathogenic antibiotic resistant bacteria (PARB) as inputs achieved the lowest median out-of-bag error rate (9.9%) and increased median kappa coefficient (0.8694), while adding fungal inputs reduced the kappa coefficient. The model based on the top 30 bacteria still showed an advantage compared with models based on PCIs or the combination of PCIs and MBIs. With improvement in sequencing technology and increase in data availability in the future, the proposed method provides an economical, rapid, and reliable way in which to identify water sample sources based on abundance data of microbial communities.
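The abstract's evaluation setup (a random forest classifier scored by out-of-bag error rate and kappa coefficient) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic relative abundances in place of real sequencing data, and the function names `make_synthetic_abundance` and `fit_source_classifier`, the tree count, and the site labels are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score


def make_synthetic_abundance(n_per_site=40, n_taxa=30, seed=0):
    """Generate fake relative abundances for three site types (illustrative only)."""
    rng = np.random.default_rng(seed)
    sites = ["mountain", "urban", "agricultural"]  # hypothetical labels
    rows, labels = [], []
    for site in sites:
        # Each site type gets its own mean count per taxon.
        mean_counts = rng.uniform(20, 200, size=n_taxa)
        counts = rng.poisson(mean_counts, size=(n_per_site, n_taxa))
        # Convert raw counts to relative abundance per sample.
        rows.append(counts / counts.sum(axis=1, keepdims=True))
        labels += [site] * n_per_site
    return np.vstack(rows), np.array(labels)


def fit_source_classifier(abundance, source, n_trees=500, seed=0):
    """Fit a random forest and report out-of-bag error and Cohen's kappa."""
    model = RandomForestClassifier(
        n_estimators=n_trees, oob_score=True, random_state=seed
    )
    model.fit(abundance, source)
    # OOB error: share of samples misclassified by trees that never saw them.
    oob_error = 1.0 - model.oob_score_
    # Recover the OOB class predictions to compute kappa on the same basis.
    oob_pred = model.classes_[np.argmax(model.oob_decision_function_, axis=1)]
    kappa = cohen_kappa_score(source, oob_pred)
    return model, oob_error, kappa


X, y = make_synthetic_abundance()
model, oob_error, kappa = fit_source_classifier(X, y)
print(f"OOB error rate: {oob_error:.3f}, kappa: {kappa:.3f}")
```

The abstract reports median values, which suggests repeated model runs; in this sketch, repeating the fit over several `seed` values and taking the median of the resulting errors and kappas would be one way to approximate that.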




Updated: 2021-05-11