Predicting the Reproductive Toxicity of Chemicals Using Ensemble Learning Methods and Molecular Fingerprints
Toxicology Letters ( IF 2.9 ) Pub Date : 2021-04-01 , DOI: 10.1016/j.toxlet.2021.01.002
Huawei Feng , Li Zhang , Shimeng Li , Lili Liu , Tianzhou Yang , Pengyu Yang , Jian Zhao , Isaiah Tuvia Arkin , Hongsheng Liu

Reproductive toxicity endpoints are a significant safety concern in the assessment of the adverse effects of chemicals in drug discovery. Computational models that can accurately predict a chemical's toxic potential are increasingly pursued to replace traditional animal experiments. Thus, ensemble learning models were built to predict the reproductive toxicity of compounds. Our ensemble models were developed using support vector machine, random forest, and extreme gradient boosting methods and 9 molecular fingerprints calculated for a dataset containing 1823 chemicals. The best prediction performance was achieved by the Ensemble-Top12 model, with an accuracy (ACC) of 86.33%, a sensitivity (SEN) of 82.02%, a specificity (SPE) of 90.19%, and an area under the receiver operating characteristic curve (AUC) of 0.937 in 5-fold cross-validation, and ACC, SEN, SPE, and AUC values of 84.38%, 86.90%, 90.67%, and 0.920, respectively, in external validation. We also defined the applicability domain (AD) of the ensemble model by calculating the Tanimoto distance of the training set. Compared with models in the existing literature, our ensemble model achieves relatively high ACC, SPE, and AUC values. We also identified several fingerprint features related to chemical reproductive toxicity. Considering the model's performance, we recommend using the Ensemble-Top12 model to predict reproductive toxicity in early drug development.
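
The metrics and the applicability domain named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. ACC, SEN, and SPE follow their standard confusion-matrix definitions, and the Tanimoto distance is taken as 1 minus the Tanimoto similarity of two fingerprints stored as sets of on-bit indices. The in-domain rule (distance to the nearest training compound no greater than a threshold) and the names `confusion_metrics`, `tanimoto_distance`, `in_domain`, and `threshold` are assumptions made for illustration; the abstract does not state the exact rule or cutoff the authors used.

```python
from typing import Dict, Sequence, Set


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return ACC, SEN, and SPE for binary labels (1 = toxic, 0 = non-toxic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, predicted toxic
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-toxic, predicted non-toxic
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-toxic, predicted toxic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, predicted non-toxic
    return {
        "ACC": (tp + tn) / len(pairs),
        "SEN": tp / (tp + fn) if (tp + fn) else float("nan"),
        "SPE": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |a & b| / |a | b| for fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def in_domain(query: Set[int], training: Sequence[Set[int]], threshold: float) -> bool:
    """Assumed AD rule: the nearest training compound lies within `threshold`."""
    nearest = min(tanimoto_distance(query, fp) for fp in training)
    return nearest <= threshold


if __name__ == "__main__":
    # Toy labels and fingerprints, invented purely for illustration.
    print(confusion_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
    print(in_domain({1, 2, 3}, [{2, 3, 4}, {7, 8}], threshold=0.5))
```

In this sketch, a query compound that falls outside the domain would be flagged as a less reliable prediction rather than rejected outright; how such compounds are handled in practice is a design choice the abstract does not describe.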

Updated: 2021-04-01