Prediction of Drug–Target Interactions Based on Network Representation Learning and Ensemble Learning
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6), Pub Date: 2020-04-23, DOI: 10.1109/tcbb.2020.2989765
Ping Xuan, Bingxu Chen, Tiangang Zhang, Yan Yang

Identifying interactions between drugs and target proteins is a critical step in the drug development process, as it helps identify new targets for drugs and accelerates drug development. The number of known drug–protein interactions (positive samples) is much lower than that of the unknown ones (negative samples), which creates a class imbalance. Most previous methods utilised only part of the negative samples to train the prediction model, so most of the information in the negative samples was neglected. A new method is therefore needed that predicts candidate drug-related proteins and fully utilises the negative samples to improve prediction performance. We present a method based on non-negative matrix factorisation and gradient boosting decision tree (GBDT), named NGDTP, to identify candidate drug–protein interactions. NGDTP integrates multiple kinds of protein similarities, drug–protein interactions, and multiple kinds of drug similarities at different levels, including the target proteins of drugs, drug-related diseases, and the side effects of drugs. We propose a network representation learning method based on matrix factorisation to learn low-dimensional vector representations of drug and protein nodes. On the basis of these low-dimensional node representations, a GBDT-based prediction model is constructed, which obtains the association score of a drug–protein pair by building multiple decision trees. NGDTP is an ensemble learning model that fully utilises all the negative samples to effectively alleviate the problem of class imbalance. NGDTP achieves superior prediction performance compared with several state-of-the-art methods. The experimental results indicate that NGDTP also retrieves more actual drug–protein interactions among the top-ranked predictions, the part of the ranking that attracts the most attention from biologists. In addition, case studies on 10 drugs further confirm the ability of NGDTP to identify potential candidate proteins for drugs.
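
The two-stage idea described in the abstract (matrix-factorisation embeddings followed by a GBDT scorer) can be illustrated with a minimal sketch. This is not the authors' NGDTP implementation: it assumes a toy interaction matrix, factorises only that matrix with scikit-learn's `NMF` (the similarity networks that NGDTP integrates are omitted), and trains a single `GradientBoostingClassifier` on every drug–protein pair. All variable names and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import GradientBoostingClassifier

# Toy binary drug-protein interaction matrix (hypothetical data, not from the paper):
# 20 drugs x 15 proteins, where drug i interacts with protein j when i % 5 == j % 5.
# This yields 60 positives among 300 pairs, so negatives outnumber positives.
n_drugs, n_proteins, k = 20, 15, 5
A = (np.arange(n_drugs)[:, None] % 5 == np.arange(n_proteins)[None, :] % 5).astype(float)

# Stage 1: non-negative matrix factorisation, A ~= W @ H.
# Rows of W serve as drug vectors; columns of H serve as protein vectors.
nmf = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
drug_vecs = nmf.fit_transform(A)     # shape (n_drugs, k)
protein_vecs = nmf.components_.T     # shape (n_proteins, k)

# Stage 2: one feature vector per drug-protein pair (simple concatenation, an assumption).
pairs = [(i, j) for i in range(n_drugs) for j in range(n_proteins)]
X = np.array([np.concatenate([drug_vecs[i], protein_vecs[j]]) for i, j in pairs])
y = np.array([int(A[i, j]) for i, j in pairs])

# Train the GBDT on all pairs, so every negative sample is used rather than a subsample.
gbdt = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbdt.fit(X, y)

# Association score for every pair: predicted probability of the positive class.
scores = gbdt.predict_proba(X)[:, 1].reshape(n_drugs, n_proteins)
```

For brevity, the sketch scores the same pairs it was trained on and learns the embeddings from the labels themselves, so it demonstrates the data flow rather than a fair evaluation. A realistic experiment would hold out interactions before factorisation, for example with cross-validation.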

Updated: 2020-04-23