Molecular generation by Fast Assembly of (Deep)SMILES fragments
Journal of Cheminformatics (IF 8.6), Pub Date: 2021-11-14, DOI: 10.1186/s13321-021-00566-4
Francois Berenger, Koji Tsuda

In recent years, in silico molecular design has been regaining interest. To generate molecules with optimized properties on a computer, scoring functions can be coupled with a molecular generator to design novel molecules with a desired property profile. In this article, a simple method is described that generates only valid molecules at high frequency (> 300,000 molecules/s using a single CPU core), given a molecular training set. The proposed method generates diverse SMILES (or DeepSMILES) encoded molecules while also showing some propensity for training set distribution matching. When working with DeepSMILES, the method reaches peak performance (> 340,000 molecules/s) because it relies almost exclusively on string operations. The “Fast Assembly of SMILES Fragments” software is released as open source at https://github.com/UnixJunkie/FASMIFRA. Experiments on speed, training set distribution matching, and molecular diversity, as well as benchmarks against several other methods, are also shown.
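
To give a feel for why a generator built on string operations can be fast, here is a minimal Python sketch that joins SMILES fragments by plain concatenation and times the run. It is not the FASMIFRA algorithm, whose fragmentation and assembly scheme this abstract does not describe. The fragment list and the names `FRAGMENTS`, `assemble`, and `throughput` are all hypothetical, and the fragments were hand-picked with the intention that joining them end to end stays within ordinary valences, which the code does not verify.

```python
import random
import time

# Hypothetical toy fragments (assumption, not taken from the paper).
# Each one closes its own rings, so end-to-end concatenation only adds
# a single bond between the last atom of one fragment and the first
# atom of the next.
FRAGMENTS = ["C", "CC", "O", "N", "CC(=O)C", "c1ccccc1"]


def assemble(rng, n_fragments=4):
    """Join randomly chosen fragments into one SMILES string."""
    return "".join(rng.choice(FRAGMENTS) for _ in range(n_fragments))


def throughput(n_molecules=100_000, seed=0):
    """Generate n_molecules strings; return them with molecules per second."""
    rng = random.Random(seed)
    start = time.perf_counter()
    molecules = [assemble(rng) for _ in range(n_molecules)]
    elapsed = time.perf_counter() - start
    return molecules, n_molecules / elapsed


if __name__ == "__main__":
    mols, rate = throughput()
    print(mols[:3])
    print(f"{rate:,.0f} molecules/s")
```

The rate printed by this sketch depends on the machine and the interpreter, and it is not comparable to the figures reported in the abstract. A real pipeline would also need fragments derived from a training set and a validity check with a cheminformatics toolkit.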

Updated: 2021-11-15