Deep learning integration of molecular and interactome data for protein–compound interaction prediction
Journal of Cheminformatics (IF 7.1), Pub Date: 2021-05-01, DOI: 10.1186/s13321-021-00513-3
Narumi Watanabe, Yuuto Ohnuki, Yasubumi Sakakibara

Virtual screening, which computationally predicts the presence or absence of protein–compound interactions, has attracted attention as a large-scale, low-cost, and rapid method of searching for seed compounds. Existing machine learning methods for predicting protein–compound interactions fall largely into two groups: those based on molecular structure data and those based on network data. The former use information on the proteins and compounds themselves, such as amino acid sequences and chemical structures; the latter rely on interaction network data, such as protein–protein interactions and compound–compound interactions. However, there have been few attempts to combine the two types of data, molecular information and interaction networks. We developed a deep learning-based method that integrates protein features, compound features, and multiple types of interactome data to predict protein–compound interactions. We designed three benchmark datasets of differing difficulty and used them to evaluate the prediction method. The performance evaluations show that our deep learning framework for integrating molecular structure data and interactome data outperforms state-of-the-art machine learning methods on protein–compound interaction prediction tasks, and the improvement is statistically significant according to the Wilcoxon signed-rank test. This finding indicates that the multi-interactome data capture perspectives other than amino acid sequence homology and chemical structure similarity, and that the two types of data synergistically improve prediction accuracy. Furthermore, experiments on the three benchmark datasets show that our method is more robust than existing methods in accurately predicting interactions between proteins and compounds that are unseen in the training samples.
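
The abstract reports significance via the Wilcoxon signed-rank test but does not describe how the test was set up. As a rough illustration only, the sketch below applies an exact two-sided version of the test to paired per-fold scores from two methods. The score lists and the names `signed_ranks`, `wilcoxon_signed_rank`, `proposed`, and `baseline` are hypothetical and are not taken from the paper; the exact enumeration used here is only practical for a small number of pairs.

```python
from itertools import product


def signed_ranks(a, b):
    """Return signed ranks of the nonzero paired differences a[i] - b[i].

    Tied absolute differences receive the average of the ranks they span.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return [r if d > 0 else -r for r, d in zip(ranks, diffs)]


def wilcoxon_signed_rank(a, b):
    """Return (W, p): W = min(W+, W-) and the exact two-sided p-value."""
    sr = signed_ranks(a, b)
    ranks = [abs(r) for r in sr]
    total = sum(ranks)
    w_plus = sum(r for r in sr if r > 0)
    w = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely,
    # so count the patterns at least as extreme as the observed one.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        pos = sum(r for r, s in zip(ranks, signs) if s)
        if min(pos, total - pos) <= w:
            hits += 1
    return w, hits / 2 ** len(ranks)


# Hypothetical per-fold scores for two methods (illustrative, not from the paper).
proposed = [0.91, 0.89, 0.93, 0.90, 0.92, 0.94]
baseline = [0.88, 0.87, 0.89, 0.91, 0.87, 0.88]
w, p = wilcoxon_signed_rank(proposed, baseline)
print(w, p)  # 1.0 0.0625
```

With only six pairs, even a nearly one-sided result cannot reach p < 0.05 in a two-sided test, which is one reason such comparisons are usually run over many folds or datasets.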

Updated: 2021-05-03