Predicting drug–protein interaction using quasi-visual question answering system
Nature Machine Intelligence ( IF 23.8 ) Pub Date : 2020-02-14 , DOI: 10.1038/s42256-020-0152-y
Shuangjia Zheng , Yongjian Li , Sheng Chen , Jun Xu , Yuedong Yang

Identifying novel drug–protein interactions is crucial for drug discovery. For this purpose, many machine learning methods have been developed based on drug descriptors and one-dimensional protein sequences. However, protein sequences cannot accurately reflect interactions in three-dimensional space. Feeding three-dimensional structures in directly is also problematic: the sparse three-dimensional matrix makes it inefficient, and the limited number of co-crystal structures available for training restricts it further. Here we propose an end-to-end deep learning framework that predicts interactions by representing proteins as two-dimensional distance maps derived from monomer structures (Image) and drugs as molecular linear notations (String), following the visual question answering mode. To train the system efficiently, we introduce a dynamic attentive convolutional neural network that learns fixed-size representations from the variable-length distance maps, and a self-attentional sequential model that automatically extracts semantic features from the linear notations. Extensive experiments demonstrate that our model obtains competitive performance against state-of-the-art baselines on the Directory of Useful Decoys, Enhanced (DUD-E), Human and BindingDB benchmark datasets. Attention visualization further provides a biological interpretation by highlighting the relevant regions of both protein and drug molecules.
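
The two input representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing: the abstract does not say which atoms define the distance map or how the linear notation is tokenized, so the sketch assumes one 3D coordinate per residue (for example the C-alpha atom) and a simple regex tokenizer for SMILES strings. The function names `distance_map` and `tokenize_smiles` are hypothetical.

```python
import math
import re


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, assumed here to be one point per residue.
    The result is an n x n "image" whose size varies with protein length.
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = d
            dmap[j][i] = d
    return dmap


# Assumed tokenization: bracket atoms first, then two-letter halogens,
# then any remaining single character.
_SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens (the "question" string)."""
    return _SMILES_TOKEN.findall(smiles)


# Toy inputs: three made-up residue positions and acetyl chloride.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dmap = distance_map(coords)
tokens = tokenize_smiles("CC(=O)Cl")
```

Because the map is n x n for a protein of n residues, its size changes from protein to protein, which is why the abstract calls for a network that reduces variable-length maps to fixed-size representations.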




Updated: 2020-02-14