Explainable Molecular Sets: Using Information Theory to Generate Meaningful Descriptions of Groups of Molecules
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2021-10-12, DOI: 10.1021/acs.jcim.1c00519
Adam C Mater, Michelle L Coote

Algorithmically identifying the meaningful similarities among an assortment of molecules is a critical chemical problem, and one that is only gaining relevance as data-driven chemistry progresses. This challenge can be addressed by reformulating the problem in terms of information theory and cluster-based supervised classification, and by applying key concepts, particularly information entropy and mutual information. These concepts are combined with unsupervised learning atop learned chemical spaces to generate meaningful labels for arbitrary collections of molecules. An open-source and highly extensible codebase is provided to undertake these experiments, demonstrate the viability of the approach on known clusters, and glean insights into the learned representations of chemical space within message-passing neural networks, an architecture that does not readily permit interpretation. This approach facilitates interoperability between human chemical knowledge and algorithmically derived insights, which will become more prevalent in the coming years.
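
As a rough illustration of the general idea of labeling a cluster with information entropy and mutual information, the sketch below scores binary molecular features by how much they reveal about cluster membership. This is not the authors' codebase or method: the function names, the feature names, and the toy data are all hypothetical, chosen only to show the calculation I(X;Y) = H(X) + H(Y) - H(X,Y).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: 8 molecules, 1 = inside the cluster of interest.
in_cluster = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical binary features (1 = substructure present in the molecule).
features = {
    "has_carboxylic_acid": [1, 1, 1, 1, 0, 0, 0, 0],
    "has_aromatic_ring": [1, 0, 1, 0, 1, 0, 1, 0],
}

# Score each candidate label by how informative it is about membership.
scores = {
    name: mutual_information(in_cluster, present)
    for name, present in features.items()
}
best_label = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f} bits")
print("most informative label:", best_label)
```

In this toy case the first feature coincides exactly with cluster membership, so it carries the full 1 bit of membership entropy, while the second is statistically independent of membership and carries 0 bits. A feature with high mutual information is therefore a candidate human-readable description of the cluster.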

Updated: 2021-10-25