Predicting crop root concentration factors of organic contaminants with machine learning models
Journal of Hazardous Materials (IF 12.2), Pub Date: 2021-10-05, DOI: 10.1016/j.jhazmat.2021.127437
Feng Gao 1 , Yike Shen 2 , J Brett Sallach 3 , Hui Li 4 , Wei Zhang 4 , Yuanbo Li 5 , Cun Liu 6

Accurate prediction of the uptake and accumulation of organic contaminants by crops from soils is essential to assessing human exposure via the food chain. However, traditional empirical or mechanistic models frequently show variable performance because of complex interactions among contaminants, soils, and plants. In this study, different machine learning algorithms were therefore compared and applied to predict root concentration factors (RCFs) from a dataset comprising 57 chemicals and 11 crops, with a traditional linear regression model as the benchmark. RCF patterns were explored with unsupervised t-distributed stochastic neighbor embedding, and RCFs were predicted with four supervised machine learning models (Random Forest, Gradient Boosting Regression Tree, Fully Connected Neural Network, and Support Vector Regression) based on 15 property descriptors. The Fully Connected Neural Network demonstrated superior prediction performance for RCFs (R² = 0.79, mean absolute error [MAE] = 0.22) over the other machine learning models (R² = 0.68–0.76, MAE = 0.23–0.26). All four machine learning models performed better than the traditional linear regression model (R² = 0.62, MAE = 0.29). Four key property descriptors were identified in predicting RCFs. Specifically, increasing root lipid content and decreasing soil organic matter content increased RCFs, while increasing excess molar refractivity and molecular volume of contaminants decreased RCFs. These results show that machine learning models can improve prediction accuracy by learning nonlinear relationships between RCFs and the properties of contaminants, soils, and plants.
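The abstract compares models by R² and MAE. As a minimal sketch of what those two numbers measure, the following pure-Python functions compute them under the standard definitions (coefficient of determination and mean absolute error). The abstract does not state how the authors computed their metrics, so the definitions are an assumption, and the function names and sample values are hypothetical and not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted values, for illustration only.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.0, 2.5]

print(f"MAE = {mean_absolute_error(observed, predicted):.2f}")
print(f"R2  = {r2_score(observed, predicted):.2f}")
```

A higher R² together with a lower MAE, as reported for the Fully Connected Neural Network relative to the linear regression benchmark, means the model explains more of the variance in the observed values while making smaller errors on average.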




Updated: 2021-10-20