OctSurf: Efficient hierarchical voxel-based molecular surface representation for protein-ligand affinity prediction, Journal of Molecular Graphics and Modelling

Journal of Molecular Graphics and Modelling (IF 2.7), Pub Date: 2021-02-09, DOI: 10.1016/j.jmgm.2021.107865
Qinqing Liu₁, Peng-Shuai Wang₂, Chunjiang Zhu₁, Blake Blumenfeld Gaines₁, Tan Zhu₁, Jinbo Bi₃, Minghu Song₄

Voxel-based 3D convolutional neural networks (CNNs) have been applied to predict protein-ligand binding affinity. However, the memory usage and computation cost of these voxel-based approaches increase cubically with spatial resolution, which can make volumetric CNNs intractable at higher resolutions. It is therefore necessary to develop memory-efficient alternatives that accelerate the convolution operation on 3D volumetric representations of the protein-ligand interaction. In this study, we implement a novel volumetric representation, OctSurf, to characterize the 3D molecular surface of protein binding pockets and bound ligands. The OctSurf surface representation is built on the octree data structure, which has been widely used in computer graphics to efficiently represent and store 3D object data. Vanilla 3D-CNN approaches often divide the 3D space of objects into equal-sized voxels. In contrast, OctSurf recursively partitions the 3D space containing the protein-ligand pocket into eight subspaces called octants. Only those octants containing van der Waals surface points of protein or ligand atoms undergo the recursive subdivision process until they reach the predefined octree depth, whereas unoccupied octants are kept intact to reduce the memory cost. The resulting non-empty leaf octants approximate the molecular surfaces of the protein pocket and bound ligands. These surface octants, along with their chemical and geometric features, are used as the input to 3D-CNNs. Two kinds of CNN architectures, VGG and ResNet, are applied to the OctSurf representation to predict binding affinity. The OctSurf representation consumes much less memory than the conventional voxel representation at the same resolution. By restricting the convolution operation to only the octants of the smallest size, our method also reduces the overall computational overhead of the CNN. A series of experiments is performed to demonstrate the disk storage and computational efficiency of the proposed learning method. Our code is available at the following GitHub repository: https://github.uconn.edu/mldrugdiscovery/OctSurf.
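
The recursive subdivision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is in the repository linked above); it is a toy example under stated assumptions: the function names `sphere_surface_points` and `count_leaf_octants`, the two made-up "atoms", the 20 Å bounding cube, and the use of full sphere surfaces (without removing points buried inside neighboring atoms) are all hypothetical simplifications. It only shows how subdividing occupied octants alone keeps the number of stored cells far below the 8^depth cells of a dense voxel grid at the same resolution.

```python
import math


def sphere_surface_points(center, radius, n):
    """Sample n roughly uniform points on a sphere (Fibonacci lattice)."""
    cx, cy, cz = center
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n          # z in (-1, 1)
        r = math.sqrt(1.0 - z * z)             # ring radius at height z
        theta = golden_angle * i
        points.append((cx + radius * r * math.cos(theta),
                       cy + radius * r * math.sin(theta),
                       cz + radius * z))
    return points


def count_leaf_octants(points, origin, size, depth):
    """Count non-empty leaf octants after `depth` levels of subdivision.

    `origin` is the minimum corner of the cube and `size` its edge length;
    all points are assumed to lie inside that cube. Only octants holding at
    least one surface point are subdivided; empty octants are never visited.
    """
    if not points:
        return 0
    if depth == 0:
        return 1
    half = size / 2.0
    ox, oy, oz = origin
    children = {}
    for p in points:
        # Each coordinate picks the lower (0) or upper (1) half of the cube.
        key = (int(p[0] >= ox + half),
               int(p[1] >= oy + half),
               int(p[2] >= oz + half))
        children.setdefault(key, []).append(p)
    total = 0
    for (ix, iy, iz), child_points in children.items():
        child_origin = (ox + ix * half, oy + iy * half, oz + iz * half)
        total += count_leaf_octants(child_points, child_origin, half, depth - 1)
    return total


if __name__ == "__main__":
    # Hypothetical toy input: two overlapping "atoms" as (center, radius).
    atoms = [((10.0, 10.0, 10.0), 1.7), ((11.5, 10.0, 10.0), 1.5)]
    points = [p for c, r in atoms for p in sphere_surface_points(c, r, 2000)]
    for depth in range(1, 7):
        occupied = count_leaf_octants(points, (0.0, 0.0, 0.0), 20.0, depth)
        print(f"depth {depth}: {occupied} surface octants vs {8 ** depth} dense voxels")
```

The dense grid grows by a factor of eight per level, while the count of occupied leaf octants is bounded by the number of surface points and, for a surface, tends to grow closer to a factor of four per level. That gap is the source of the memory saving the abstract refers to.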

Updated: 2021-02-26
