DrugEx v2: de novo design of drug molecules by Pareto-based multi-objective reinforcement learning in polypharmacology
Journal of Cheminformatics (IF 8.6), Pub Date: 2021-11-12, DOI: 10.1186/s13321-021-00561-9
Xuhan Liu 1, Kai Ye 2, Herman W T van Vlijmen 1,3, Michael T M Emmerich 4, Adriaan P IJzerman 1, Gerard J P van Westen 1

In polypharmacology, drugs are required to bind to multiple specific targets, for example to enhance efficacy or to reduce resistance formation. Although deep learning has achieved a breakthrough in de novo design in drug discovery, most of its applications focus only on a single drug target to generate drug-like active molecules. However, in reality, drug molecules often interact with more than one target, which can have desired (polypharmacology) or undesired (toxicity) effects. In a previous study we proposed a new method named DrugEx that integrates an exploration strategy into RNN-based reinforcement learning to improve the diversity of the generated molecules. Here, we extended our DrugEx algorithm with multi-objective optimization to generate drug-like molecules towards multiple targets or one specific target while avoiding off-targets (in this study, the two adenosine receptors A1AR and A2AAR, and the potassium ion channel hERG). In our model, we applied an RNN as the agent and machine learning predictors as the environment. Both the agent and the environment were pre-trained in advance and then interacted under a reinforcement learning framework. The concept of evolutionary algorithms was merged into our method, such that crossover and mutation operations were implemented by the same deep learning model as the agent. During the training loop, the agent generates a batch of SMILES-based molecules. Subsequently, the scores for all objectives provided by the environment are used to construct Pareto ranks of the generated molecules. For this ranking, a non-dominated sorting algorithm and a Tanimoto-based crowding distance algorithm using chemical fingerprints are applied. Here, we adopted GPU acceleration to speed up the process of Pareto optimization. The final reward of each molecule is calculated from the Pareto ranking with a ranking selection algorithm. The agent is trained under the guidance of this reward to make sure it can generate the desired molecules after the training process converges. All in all, we demonstrate the generation of compounds with diverse predicted selectivity profiles towards multiple targets, offering the potential of high efficacy and low toxicity.
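The Pareto-ranking step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' GPU-accelerated implementation; the binary fingerprint matrix fps, the use of the mean Tanimoto distance to the other members of a front as the crowding measure, and the linear mapping from rank to reward are simplifying assumptions made only to show the idea of non-dominated sorting followed by ranking selection.

import numpy as np

def non_dominated_sort(scores):
    # Fast non-dominated sorting (NSGA-II style).
    # scores: (n_mols, n_objectives) array, higher is better for every objective.
    # Returns a list of fronts (lists of molecule indices); front 0 is Pareto-optimal.
    n = len(scores)
    dominates = [[] for _ in range(n)]        # indices that molecule i dominates
    dominated_count = np.zeros(n, dtype=int)  # how many molecules dominate i
    for i in range(n):
        for j in range(i + 1, n):
            if np.all(scores[i] >= scores[j]) and np.any(scores[i] > scores[j]):
                dominates[i].append(j); dominated_count[j] += 1
            elif np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i]):
                dominates[j].append(i); dominated_count[i] += 1
    fronts = [[i for i in range(n) if dominated_count[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominates[i]:
                dominated_count[j] -= 1
                if dominated_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]                        # drop the trailing empty front

def tanimoto_crowding(front, fps):
    # Crowding measured in chemical space: for each molecule in the front,
    # the mean Tanimoto distance (1 - similarity) to the other front members.
    sub = fps[front].astype(np.int64)                  # binary 0/1 fingerprints
    inter = sub @ sub.T                                # |a AND b|
    bits = sub.sum(axis=1)
    union = bits[:, None] + bits[None, :] - inter      # |a OR b|
    sim = inter / np.maximum(union, 1)
    return 1.0 - (sim.sum(axis=1) - 1.0) / max(len(front) - 1, 1)

def pareto_rewards(scores, fps, r_min=0.0, r_max=1.0):
    # Rank all molecules by Pareto front, then by crowding within each front
    # (less crowded first), and map the ranking linearly onto [r_min, r_max].
    order = []
    for front in non_dominated_sort(scores):
        crowd = tanimoto_crowding(front, fps)
        order.extend(np.asarray(front)[np.argsort(-crowd)])
    rewards = np.empty(len(order))
    rewards[np.asarray(order)] = np.linspace(r_max, r_min, len(order))
    return rewards

# Toy example: 4 molecules, 3 objectives (e.g. A1AR, A2AAR, inverted hERG), 8-bit fingerprints.
scores = np.array([[0.9, 0.8, 0.7],
                   [0.6, 0.9, 0.8],
                   [0.5, 0.4, 0.3],
                   [0.2, 0.3, 0.9]])
fps = np.random.default_rng(0).integers(0, 2, size=(4, 8))
print(pareto_rewards(scores, fps))

In this sketch, molecules in better fronts always receive higher rewards than molecules in worse fronts, and within a front the Tanimoto-based crowding term favours molecules in sparsely populated regions of chemical space, which is the role the crowding distance plays in keeping the generated set diverse.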

Updated: 2021-11-12