A bidirectional interpretable compound-protein interaction prediction framework based on cross attention
Computers in Biology and Medicine (IF 7.0), Pub Date: 2024-03-02, DOI: 10.1016/j.compbiomed.2024.108239
Meng Wang 1 , Jianmin Wang 2 , Zhiwei Rong 3 , Liuying Wang 1 , Zhenyi Xu 1 , Liuchao Zhang 1 , Jia He 1 , Shuang Li 1 , Lei Cao 1 , Yan Hou 3 , Kang Li 1

The identification of compound-protein interactions (CPIs) plays a vital role in drug discovery. However, the high cost and labor-intensive nature of in vitro and in vivo experiments make it urgent for researchers to develop novel CPI prediction methods. Although emerging deep learning methods have achieved promising performance in CPI prediction, they still face ongoing challenges: (i) providing bidirectional interpretability of the prediction results from both the chemical and the biological perspective; (ii) comprehensively evaluating model generalization performance; and (iii) demonstrating the practical applicability of these models. To address these challenges, we propose a cross multi-head attention oriented bidirectional interpretable CPI prediction model (CmhAttCPI). First, CmhAttCPI takes molecular graphs and protein sequences as inputs, using the GCW module to learn atom features and the CNN module to learn residue features. Second, the model applies a cross multi-head attention module to compute attention weights for atoms and residues. Finally, CmhAttCPI employs a fully connected neural network to predict scores for CPIs. We evaluated the performance of CmhAttCPI on both balanced and imbalanced datasets, and the results consistently show that CmhAttCPI outperforms multiple state-of-the-art methods. We also constructed three scenarios based on compound and protein clustering and comprehensively evaluated the model's generalization ability within these scenarios. The results demonstrate that the generalization ability of CmhAttCPI surpasses that of the other models. In addition, visualizations of the attention weights reveal that CmhAttCPI provides chemical and biological interpretations for CPI prediction. Finally, case studies confirm the practical applicability of CmhAttCPI in discovering anticancer candidates.
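
The cross multi-head attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give layer sizes, projections, or pooling, so the function name `cross_attention`, the feature dimension of 8, the two heads, and the random toy features (standing in for the atom features and residue features the abstract mentions) are all assumptions chosen for illustration. Learned query/key/value projections are omitted; each head simply attends over its own slice of the feature vector.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, num_heads):
    """Scaled dot-product cross attention (hypothetical helper).

    queries: list of n_q feature vectors, each of length d
    keys:    list of n_k feature vectors, each of length d
    Returns (weights, context):
      weights is n_q x n_k, averaged over heads, each row summing to 1;
      context is n_q x d, the per-head attention-weighted sum of key vectors.
    Assumption: no learned Q/K/V projections; head h uses feature slice h.
    """
    d = len(queries[0])
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    d_head = d // num_heads
    n_q, n_k = len(queries), len(keys)
    weights = [[0.0] * n_k for _ in range(n_q)]
    context = [[0.0] * d for _ in range(n_q)]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        for i, q in enumerate(queries):
            # Similarity of query i to every key, within this head's slice.
            scores = [
                sum(q[c] * k[c] for c in range(lo, hi)) / math.sqrt(d_head)
                for k in keys
            ]
            attn = softmax(scores)
            for j, a in enumerate(attn):
                weights[i][j] += a / num_heads
                for c in range(lo, hi):
                    context[i][c] += a * keys[j][c]
    return weights, context


# Toy inputs (assumed): 5 atoms and 12 residues, 8 features each.
rng = random.Random(0)
atoms = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
residues = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(12)]

# Direction 1: each atom attends over residues (which residues matter per atom).
atom_to_res, atom_ctx = cross_attention(atoms, residues, num_heads=2)
# Direction 2: each residue attends over atoms (which atoms matter per residue).
res_to_atom, res_ctx = cross_attention(residues, atoms, num_heads=2)
```

Running attention in both directions yields two weight matrices, one over residues for each atom and one over atoms for each residue, which is one plausible reading of the "bidirectional" interpretability the abstract refers to. In a full model, the two context outputs would presumably be pooled and passed to the fully connected network that predicts the interaction score.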

Updated: 2024-03-02