A Novel Method for Inference of Acyclic Chemical Compounds with Bounded Branch-height Based on Artificial Neural Networks and Integer Programming
arXiv - CS - Data Structures and Algorithms Pub Date : 2020-09-21 , DOI: arxiv-2009.09646
Naveed Ahmed Azam, Jianshen Zhu, Yanming Sun, Yu Shi, Aleksandar Shurbevski, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu

Analysis of chemical graphs is a major research topic in computational molecular biology due to its potential applications to drug design. One approach is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which infers chemical structures from given chemical activities/properties. Recently, a framework has been proposed for inverse QSAR/QSPR using artificial neural networks (ANN) and mixed integer linear programming (MILP). This method consists of a prediction phase and an inverse prediction phase. In the first phase, a feature vector $f(G)$ of a chemical graph $G$ is introduced and a prediction function $\psi$ on a chemical property $\pi$ is constructed with an ANN. In the second phase, given a target value $y^*$ of property $\pi$, a feature vector $x^*$ is inferred by solving an MILP formulated from the trained ANN so that $\psi(x^*)$ is close to $y^*$, and then a set of chemical structures $G^*$ such that $f(G^*)= x^*$ is enumerated by a graph search algorithm. The framework has been applied to the case of chemical compounds with cycle index up to 2. Computational experiments on instances with $n$ non-hydrogen atoms show that a feature vector $x^*$ can be inferred for up to around $n=40$, whereas graphs $G^*$ can be enumerated only for up to $n=15$. When applied to the case of chemical acyclic graphs, the maximum computable diameter of $G^*$ was around 8. We introduce a new characterization of graph structure, "branch-height," based on which an MILP formulation and a graph search algorithm are designed for chemical acyclic graphs. The results of computational experiments using properties such as octanol/water partition coefficient, boiling point and heat of combustion suggest that the proposed method can infer chemical acyclic graphs $G^*$ with $n=50$ and diameter 30.
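
The inverse prediction phase described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and none of it is taken from the paper: the weights `W1`, `B1`, `W2`, `B2` define a hypothetical toy ReLU network standing in for the trained ANN $\psi$, the two integer features are arbitrary, and an exhaustive search over a small bounded range stands in for solving the MILP. The second stage, enumerating graphs $G^*$ with $f(G^*) = x^*$, is not sketched.

```python
from itertools import product

# Hypothetical toy weights for a network with 2 inputs and 2 hidden ReLU units.
# These values are illustrative assumptions, not taken from the paper.
W1 = [[1.0, -1.0], [0.5, 1.0]]  # hidden-layer weights, one row per hidden unit
B1 = [0.0, -1.0]                # hidden-layer biases
W2 = [2.0, 1.0]                 # output-layer weights
B2 = 0.5                        # output-layer bias


def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def psi(x):
    """Toy prediction function: one hidden ReLU layer followed by a linear output."""
    hidden = [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def infer_feature_vector(y_target, bounds):
    """Return the integer vector x within bounds that minimizes |psi(x) - y_target|.

    Exhaustive enumeration is used here in place of the MILP; it is only
    feasible for very small search spaces.
    """
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return min(product(*ranges), key=lambda x: abs(psi(x) - y_target))
```

With these toy weights, `infer_feature_vector(6.0, [(0, 4), (0, 4)])` returns `(3, 1)`, the only vector in the range whose predicted value equals the target. Enumeration grows exponentially with the number of features, which is one reason an MILP formulation of the trained network is attractive for realistic feature vectors.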

Updated: 2020-09-22