MERMAID: an open source automated hit-to-lead method based on deep reinforcement learning
Journal of Cheminformatics (IF 7.1), Pub Date: 2021-11-27, DOI: 10.1186/s13321-021-00572-6
Daiki Erikawa 1 , Nobuaki Yasuo 2 , Masakazu Sekijima 1, 2

The hit-to-lead process refines hit molecules, which showed the desired type of activity in a screening assay, so that their physicochemical properties become more drug-like. Deep learning-based molecular generative models are expected to contribute to this process. The simplified molecular input line entry system (SMILES), a string of alphanumeric characters representing the chemical structure of a molecule, is one of the most commonly used molecular representations, and SMILES-based generative models have achieved significant success. However, in contrast to molecular graphs, the partial strings produced during SMILES generation are not necessarily valid SMILES. It is also quite difficult to generate molecules starting from a given molecule, which makes SMILES hard to apply to the hit-to-lead process. In this study, we developed a SMILES-based generative model that can generate molecules starting from a given molecule. The method generates a partial SMILES and inserts it into the original SMILES using Monte Carlo Tree Search and a Recurrent Neural Network. We validated the method on a molecule dataset obtained from the ZINC database and successfully generated molecules that were well optimized for two objectives: the quantitative estimate of drug-likeness (QED) and the penalized octanol-water partition coefficient (PLogP). The source code is available at https://github.com/sekijima-lab/mermaid .
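The two SMILES-specific points in the abstract (intermediate strings need not be valid, and a generated fragment is inserted into the original string) can be illustrated with a minimal sketch. This is not the MERMAID implementation: the function names `is_syntactically_closed` and `insert_fragment` are hypothetical, the fragment and insertion position are fixed by hand rather than proposed by Monte Carlo Tree Search and a Recurrent Neural Network, and the check is purely syntactic (balanced branches and paired single-digit ring closures), not a chemical validity test.

```python
def is_syntactically_closed(smiles: str) -> bool:
    """Rough syntactic check (an illustrative assumption, not a validity test).

    Returns True when branch parentheses balance and every single-digit
    ring-closure label appears an even number of times. Two-digit `%nn`
    ring closures and all chemistry (valence, aromaticity) are ignored.
    """
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure digits seen an odd number of times
    in_bracket = False   # inside a [...] atom, digits are not ring closures
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif in_bracket:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            open_rings ^= {ch}  # toggle: first sight opens, second closes
    return depth == 0 and not open_rings and not in_bracket


def insert_fragment(original: str, fragment: str, index: int) -> str:
    """Insert a partial SMILES `fragment` into `original` at character `index`."""
    return original[:index] + fragment + original[index:]


# Hand-picked example: add a methyl branch to benzene right after the first atom.
modified = insert_fragment("c1ccccc1", "(C)", 2)
print(modified)                           # c1(C)ccccc1
print(is_syntactically_closed("c1(C"))    # False: an intermediate string
print(is_syntactically_closed(modified))  # True: the completed string
```

In practice a cheminformatics toolkit would be used to parse the resulting string and to score it against objectives such as QED; the sketch only shows why prefixes of a SMILES string cannot generally be evaluated on their own.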

Updated: 2021-11-27