Fish early life stage toxicity prediction from acute daphnid toxicity and quantum chemistry
SAR and QSAR in Environmental Research (IF 3), Pub Date: 2021-02-02
S. Schmidt, M. Schindler, D. Faber, J. Hager

ABSTRACT

One step towards reduced animal testing is the use of in silico screening methods to predict toxicity of chemicals, which requires high-quality data to develop models that are reliable and clearly interpretable. We compiled a large data set of fish early life stage no observed effect concentration endpoints (FELS NOEC) based on published data sources and internal studies, containing data for 338 molecules. Furthermore, we developed a new quantitative structure-activity-activity relationship (QSAAR) model to inform estimation of this endpoint using a combination of dimensionality reduction, regularization, and domain knowledge. In particular, we made use of a sparse partial least squares algorithm (sPLS) to select relevant variables from a large number of molecular descriptors ranging from topological to quantum chemical properties. The final QSAAR model is of low complexity, consisting of 2 latent variables based on 8 molecular descriptors and experimental Daphnia magna acute data (EC50, 48 h). We provide a mechanistic interpretation of each model parameter. The model performs well, with a coefficient of determination r² of 0.723 on the training set (cross-validated q² = 0.686) and comparable predictivity on a test data set of chemically related molecules with experimental Daphnia magna data (r² on the test set = 0.687, RMSE = 0.793 log units).
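The performance figures quoted in the abstract (r², cross-validated q², RMSE in log units) can be illustrated with a minimal sketch. This is not the authors' sPLS model: it assumes a single-predictor ordinary least squares fit of fish log NOEC on daphnid log EC50, uses leave-one-out cross-validation as one common way to obtain q², and runs on made-up numbers. All function names and data values below are hypothetical.

```python
import math
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit y ~ a + b*x for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - my) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of y (here: log units)."""
    sq_errors = [(yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)]
    return math.sqrt(mean(sq_errors))

def q_squared_loo(x, y):
    """Leave-one-out q2: each point is predicted by a model fitted without it."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return r_squared(y, preds)

# Made-up illustrative values, NOT data from the paper.
log_ec50_daphnia = [-1.0, -0.5, 0.0, 0.4, 0.9, 1.3, 1.8, 2.2]
log_noec_fish = [-2.1, -1.9, -1.0, -1.2, -0.3, 0.1, 0.2, 1.0]

a, b = fit_line(log_ec50_daphnia, log_noec_fish)
fitted = [a + b * xi for xi in log_ec50_daphnia]

r2 = r_squared(log_noec_fish, fitted)
q2 = q_squared_loo(log_ec50_daphnia, log_noec_fish)
err = rmse(log_noec_fish, fitted)
print(f"r2 = {r2:.3f}, q2 (LOO) = {q2:.3f}, RMSE = {err:.3f} log units")
```

For a least squares fit evaluated this way, q² cannot exceed the training r², so a small gap between the two (as with 0.723 versus 0.686 in the abstract) is usually read as a sign of limited overfitting. The abstract does not state which cross-validation scheme was used, so leave-one-out here is only an assumption.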




Updated: 2021-02-02