Augmented Hill-Climb increases reinforcement learning efficiency for language-based de novo molecule generation
Journal of Cheminformatics (IF 7.1), Pub Date: 2022-10-03, DOI: 10.1186/s13321-022-00646-z
Morgan Thomas, Noel M O'Boyle, Andreas Bender, Chris de Graaf

A plethora of AI-based techniques now exists to conduct de novo molecule generation that can devise molecules conditioned towards a particular endpoint in the context of drug design. One popular approach is using reinforcement learning to update a recurrent neural network or language-based de novo molecule generator. However, reinforcement learning can be inefficient, sometimes requiring up to 10^5 molecules to be sampled to optimize more complex objectives, which poses a limitation when using computationally expensive scoring functions like docking or computer-aided synthesis planning models. In this work, we propose a reinforcement learning strategy called Augmented Hill-Climb based on a simple, hypothesis-driven hybrid between REINVENT and Hill-Climb that improves sample-efficiency by addressing the limitations of both currently used strategies. We compare its ability to optimize several docking tasks with REINVENT and benchmark this strategy against other commonly used reinforcement learning strategies including REINFORCE, REINVENT (version 1 and 2), Hill-Climb and best agent reminder. We find that optimization ability is improved ~1.5-fold and sample-efficiency is improved ~45-fold compared to REINVENT while still delivering appealing chemistry as output. Diversity filters were used, and their parameters were tuned to overcome observed failure modes that exploit certain diversity filter configurations. We find that Augmented Hill-Climb outperforms the other reinforcement learning strategies used on six tasks, especially in the early stages of training or for more difficult objectives. Lastly, we show improved performance not only on recurrent neural networks but also on a reinforcement learning stabilized transformer architecture. Overall, we show that Augmented Hill-Climb improves sample-efficiency for conditioning language-based de novo molecule generation via reinforcement learning, compared to the current state-of-the-art. This makes more computationally expensive scoring functions, such as docking, more accessible on a relevant timescale.
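The abstract describes Augmented Hill-Climb as a hybrid of REINVENT's augmented-likelihood update and Hill-Climb's top-k batch selection. The sketch below is a minimal, illustrative reading of what one such update step could look like, assuming a REINVENT-style squared-error loss between the agent's log-likelihood and a reward-augmented prior log-likelihood, applied only to the top-scoring fraction of each sampled batch. The function name, sigma value and topk_fraction parameter are illustrative assumptions for this sketch, not the authors' implementation.

```python
import torch

def augmented_hill_climb_loss(agent_logp, prior_logp, scores,
                              sigma=60.0, topk_fraction=0.25):
    """
    Sketch of a single RL update in the spirit of Augmented Hill-Climb.

    agent_logp    : (batch,) log-likelihoods of sampled SMILES under the agent
    prior_logp    : (batch,) log-likelihoods of the same SMILES under the frozen prior
    scores        : (batch,) rewards in [0, 1] from the scoring function (e.g. docking)
    sigma         : reward-scaling coefficient of the augmented likelihood (assumed value)
    topk_fraction : fraction of the batch, ranked by score, used for the update (assumed value)
    """
    # REINVENT-style augmented log-likelihood: prior likelihood shifted by the scaled reward
    augmented_logp = prior_logp + sigma * scores

    # Hill-Climb-style selection: keep only the top-k molecules of the batch by score
    k = max(1, int(topk_fraction * scores.numel()))
    top_idx = torch.topk(scores, k).indices

    # Regress the agent's likelihood towards the augmented likelihood on the top-k subset only
    return torch.pow(augmented_logp[top_idx] - agent_logp[top_idx], 2).mean()

if __name__ == "__main__":
    # Stand-in tensors only, to show the loss is computable and differentiable
    batch = 64
    agent_logp = torch.randn(batch, requires_grad=True)
    prior_logp = torch.randn(batch)
    scores = torch.rand(batch)
    loss = augmented_hill_climb_loss(agent_logp, prior_logp, scores)
    loss.backward()
    print(loss.item())
```

In this reading, restricting the REINVENT loss to the highest-scoring fraction of each batch is what would concentrate gradient signal on promising molecules and reduce the number of scoring-function calls needed, consistent with the sample-efficiency gains reported in the abstract.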

Updated: 2022-10-04