Activity landscape image analysis using convolutional neural networks
Journal of Cheminformatics (IF 8.6), Pub Date: 2020-05-18, DOI: 10.1186/s13321-020-00436-5
Javed Iqbal, Martin Vogt, Jürgen Bajorath

Activity landscapes (ALs) are graphical representations that combine compound similarity and activity data. ALs are constructed to visualize local and global structure–activity relationships (SARs) contained in compound data sets. Three-dimensional (3D) ALs are reminiscent of geographical maps in which differences in landscape topology mirror different SAR characteristics. 3D AL models can be stored as differently formatted images and are thus amenable to image analysis approaches, which have so far not been considered in the context of graphical SAR analysis. In this proof-of-concept study, 3D ALs were constructed for a variety of compound activity classes, and 3D AL image variants of varying topology and information content were generated and classified. To these ends, convolutional neural networks (CNNs) were initially applied to images, taken from different viewpoints, of original 3D AL models whose color-coding reflects compound potency information. Images of 3D AL models were then transformed into variants from which one-dimensional features were extracted. Other machine learning approaches, including support vector machine (SVM) and random forest (RF) algorithms, were applied to derive models on the basis of such features. In addition, SVM and RF models were trained using other features obtained from images through edge filtering. Machine learning was able to accurately distinguish between 3D AL image variants with different topology and information content. Overall, CNNs, which learned feature representations directly from 3D AL images, achieved the highest classification accuracy. Predictive performance for CNN, SVM, and RF models was highest for image variants emphasizing topological elevation. In addition, SVM models trained on rudimentary images from edge filtering classified such images with high accuracy, which further supported the critical role of altitude-dependent topological features for image analysis and predictions.
Taken together, the findings of our proof-of-concept investigation indicate that image analysis has considerable potential for graphical SAR exploration to systematically infer different SAR characteristics from topological features of 3D ALs.
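
The edge-filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the edge filter or the image format, so the Sobel operator, the 5×5 toy grids, and the helper names `sobel_magnitude` and `edge_features` below are all assumptions made for illustration, not the authors' implementation. The sketch only shows how an edge map of an elevation-like image could be flattened into a one-dimensional feature vector of the kind an SVM or RF model might take as input.

```python
import math

# Sobel kernels: an assumed choice, the abstract does not specify the edge filter.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for the interior pixels of a 2D grid."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            # Weighted sums over the 3x3 neighborhood centered on (r, c).
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def edge_features(image):
    """Flatten the edge map into a one-dimensional feature vector."""
    return [value for row in sobel_magnitude(image) for value in row]


# Hypothetical toy "landscapes": a flat surface and one with a single sharp peak.
flat = [[0.0] * 5 for _ in range(5)]
peaked = [row[:] for row in flat]
peaked[2][2] = 1.0

flat_features = edge_features(flat)
peaked_features = edge_features(peaked)
```

In this toy setting the flat grid yields an all-zero feature vector, while the peaked grid yields nonzero values around the peak, which is the kind of elevation-dependent signal the abstract attributes to edge-filtered images.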

Updated: 2020-05-18