Could deep learning in neural networks improve the QSAR models?
SAR and QSAR in Environmental Research (IF 2.3), Pub Date: 2019-08-28, DOI: 10.1080/1062936x.2019.1650827
G Gini 1 , F Zanoli 1 , A Gamba 2 , G Raitano 2 , E Benfenati 2

Assessing chemical toxicity is a multidisciplinary process, traditionally involving in vivo, in vitro and in silico tests. The current goal in toxicology is to reduce new tests on chemicals by exploiting all the information already available. Recent advances in machine learning and deep neural networks allow computers to mine patterns automatically and learn from data. Applied to (Q)SAR model development, this technology makes it possible to discover, by learning, the structural-chemical-biological relationships and the emergent properties. Starting from Toxception, a deep neural network that predicts activity from the image of the chemical graph, we designed SmilesNet, a recurrent neural network that takes SMILES as its only input. We then integrated the two networks into the C-Tox network, which makes the final classification. The results of our networks, trained on a dataset of ~20K molecules with experimental Ames test values, match or even outperform the current state of the art. We also extract knowledge from the networks and compare it with the available structural alerts for mutagenicity. The advantage over traditional QSAR modelling is that our models extract the features automatically, without using descriptors. However, the model is successful only if large numbers of examples are provided, and its computation is more complex than in classical methods.
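
The abstract does not say how SMILES strings are presented to the recurrent network. As a minimal sketch, assuming a simple character-level encoding (which may differ from what SmilesNet actually uses), a SMILES string can be turned into a fixed-length sequence of integer indices before being fed to a recurrent model. The function names, the padding index and the example molecules below are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List

PAD = 0  # index reserved for padding (an assumption of this sketch)


def build_vocab(smiles_list: List[str]) -> Dict[str, int]:
    """Map every character seen in the SMILES strings to an index >= 1."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Encode one SMILES string as max_len indices, right-padded with PAD.

    Characters are treated one at a time, so two-letter atoms such as
    "Cl" or "Br" would be split; a real tokenizer would handle them.
    Characters missing from vocab raise KeyError.
    """
    ids = [vocab[ch] for ch in smiles[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Illustrative molecules only, not taken from the paper's dataset:
# benzene, ethanol, acetic acid.
examples = ["c1ccccc1", "CCO", "CC(=O)O"]
vocab = build_vocab(examples)
encoded = [encode(s, vocab, max_len=10) for s in examples]
```

Each row of `encoded` has the same length, so the rows can be stacked into a batch for an embedding layer followed by a recurrent layer. This only illustrates the "SMILES as the only input" idea: no hand-computed molecular descriptors are involved.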



Updated: 2019-08-28