An evolutionary algorithm for interpretable molecular representations
Chem (IF 23.5), Pub Date: 2024-02-29, DOI: 10.1016/j.chempr.2024.02.004
Philipp M. Pflüger, Marius Kühnemund, Felix Katzenburg, Herbert Kuchen, Frank Glorius

Encoding molecular structures into a computer-readable, utilizable format is the key step for any machine learning application in all chemical sciences. Current representations vary strongly in complexity and shape, depending on the application. Therefore, the number of domain-specific representations is rapidly growing, with some being altered and retuned constantly. These tailored representations raise the barriers for entry and method adaption, thus decelerating progress in application. Herein, we present a general algorithm capable of yielding a highly specific representation solely based on a given dataset. The algorithm utilizes structural queries and evolutionary methodologies to generate interpretable molecular fingerprints. These are highly suited for molecular machine learning, enabling the accurate prediction of reactivity, property, and biological activity. We demonstrate its native interpretability, allowing for the extraction of knowledge, such as reactivity trends. We anticipate that the evolutionary multipattern fingerprint (EvoMPF) will be used to discover structure-target relationships in different molecular sciences.
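To make the idea of combining structural queries with an evolutionary search more concrete, here is a minimal toy sketch. It is not the authors' EvoMPF implementation: plain substring matches on SMILES strings stand in for real structural queries, the fitness function is an ad hoc choice, and the dataset, target values, and all function names (`substrings`, `fingerprint`, `fitness`, `evolve`) are hypothetical and made up for illustration.

```python
import random

# Hypothetical toy dataset: (SMILES, made-up target value). Not from the paper.
DATA = [
    ("CCO", 1.0), ("CCCO", 1.0), ("c1ccccc1O", 1.0),
    ("CCC", 0.0), ("CCCC", 0.0), ("c1ccccc1", 0.0),
]

def substrings(smiles, max_len=3):
    """All substrings up to max_len characters (crude stand-in for structural queries)."""
    return {smiles[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(smiles) - n + 1)}

def fingerprint(smiles, patterns):
    """One bit per pattern: 1 if the pattern occurs in the SMILES string."""
    return [int(p in smiles) for p in patterns]

def fitness(patterns, data):
    """Assumed toy fitness: for each distinct pattern, the gap in mean target
    between molecules that match it and molecules that do not."""
    score = 0.0
    for p in set(patterns):  # duplicate patterns earn nothing extra
        hit = [y for s, y in data if p in s]
        miss = [y for s, y in data if p not in s]
        if hit and miss:
            score += abs(sum(hit) / len(hit) - sum(miss) / len(miss))
    return score

def evolve(data, n_bits=4, pop_size=20, generations=30, seed=0):
    """Evolve a list of n_bits patterns by truncation selection and point mutation."""
    rng = random.Random(seed)
    # Candidate patterns are drawn from the dataset itself; sorted for reproducibility.
    pool = sorted(set().union(*(substrings(s) for s, _ in data)))
    population = [[rng.choice(pool) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, data), reverse=True)
        survivors = population[:pop_size // 2]  # keep the better half unchanged
        children = []
        for parent in survivors:
            child = list(parent)
            child[rng.randrange(n_bits)] = rng.choice(pool)  # mutate one pattern
            children.append(child)
        population = survivors + children
    return max(population, key=lambda ind: fitness(ind, data))

if __name__ == "__main__":
    best = evolve(DATA)
    for pattern, bit in zip(best, fingerprint("CCO", best)):
        print(f"pattern {pattern!r}: bit {bit}")
```

Because every bit corresponds to a readable pattern, the evolved fingerprint can be inspected directly, which is the sense of interpretability this sketch tries to convey. A real implementation would use a cheminformatics toolkit for substructure matching and a trained model's predictive performance as the fitness signal.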




Updated: 2024-02-29