Development of models predicting biodegradation rate rating with multiple linear regression and support vector machine algorithms.
Chemosphere ( IF 8.1 ) Pub Date : 2020-04-04 , DOI: 10.1016/j.chemosphere.2020.126666
Weihao Tang 1 , Yanying Li 1 , Yang Yu 2 , Zhongyu Wang 1 , Tong Xu 1 , Jingwen Chen 1 , Jun Lin 2 , Xuehua Li 1

Biodegradation is a major process for removing organic chemicals from water, soil and sediment, so biodegradability is critical for evaluating the environmental persistence of organic chemicals. In this study, four quantitative structure-activity relationship (QSAR) models were developed from a dataset of 171 compounds to predict primary and ultimate biodegradation rate ratings, using multiple linear regression (MLR) and support vector machine (SVM) algorithms. Two MLR models were built on the compounds with ≤9 carbon atoms, and two SVM models were built on the compounds with >9 carbon atoms. In the MLR models, nArX (number of X on aromatic ring) is the most important descriptor governing the primary and ultimate biodegradation of organic chemicals. For the SVM models, the determination coefficient (R²), the leave-one-out cross-validated coefficient (Q²LOO) and the external validation coefficient (Q²ext) all exceed 0.9, indicating that the SVM models have satisfactory goodness-of-fit, robustness and external predictive ability. The applicability domains of the models were visualized with Williams plots. The developed models can be used as effective tools for predicting the biodegradability of organic chemicals.
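The validation statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the descriptor matrix is synthetic, the helper names (`fit_mlr`, `q2_loo`, `leverages`) are hypothetical, and only the MLR case is shown. It assumes the commonly used definitions Q²LOO = 1 − PRESS / SS_tot and a Williams-plot warning leverage h* = 3(p + 1)/n, where p is the number of descriptors and n the number of training compounds; the paper may use different conventions.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict with an intercept-first coefficient vector."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2(y, y_hat):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def leverages(X):
    """Diagonal of the hat matrix, the x-axis of a Williams plot."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.einsum("ij,jk,ik->i", A, np.linalg.pinv(A.T @ A), A)

# Synthetic stand-in for a descriptor table (assumption: 40 compounds, 3 descriptors).
rng = np.random.default_rng(0)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.5, -1.0, 0.2]) + 0.1 * rng.normal(size=n)

coef = fit_mlr(X, y)
r2_fit = r2(y, predict_mlr(coef, X))
q2 = q2_loo(X, y)
h = leverages(X)
h_star = 3 * (p + 1) / n       # assumed warning leverage
outside_domain = h > h_star    # compounds flagged as structurally influential
```

In a Williams plot, each compound's standardized residual is plotted against its leverage `h`; compounds to the right of `h_star` lie outside the structural applicability domain, so predictions for them are treated as extrapolations.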

Updated: 2020-04-06