NegStacking: Drug-Target Interaction Prediction Based on Ensemble Learning and Logistic Regression
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6), Pub Date: 2020-01-22, DOI: 10.1109/tcbb.2020.2968025
Jie Yang, Song He, Zhongnan Zhang, Xiaochen Bo

Identifying drug-target interactions (DTIs) is an important problem in drug research, and many machine learning methods for predicting potential DTIs treat it as a binary classification problem. However, known interacting drug-target pairs (positive samples) are far fewer than non-interacting pairs (negative samples). Most methods do not make sufficient use of this large pool of negative samples, which limits their prediction performance. To address this problem, we propose a stacking framework named NegStacking. First, it uses sampling to obtain multiple completely different negative sample sets. Then, each weak learner is trained on a different negative sample set together with the same positive sample set, and logistic regression (LR) is used as a meta-learner to adaptively combine these weak learners. In addition, feature subspacing and hyperparameter perturbation are applied during training to increase ensemble diversity. Finally, the trained model can be used to predict new samples. We compared NegStacking with other methods, and the experimental results show that our model performs better. NegStacking can improve the performance of DTI prediction, and it has broad application prospects for improving the drug discovery process. The source code and datasets are available at https://github.com/Open-ss/NegStacking.
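
The training procedure described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (their code is in the repository linked above), and several choices here are assumptions made only for illustration: "completely different" negative sets are read as non-overlapping chunks of a shuffled pool; the weak learners are small logistic regressions, since the abstract does not name the weak learner type; the only perturbed hyperparameter is an L2 penalty; and the meta-learner is fit on in-sample weak-learner scores rather than on out-of-fold predictions. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the score so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(X, y, lr=0.1, l2=0.0, epochs=100):
    """Plain SGD logistic regression; returns [bias, w_1, ..., w_d]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) - yi
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * (err * xj + l2 * w[j + 1])
    return w


def predict_logreg(w, x):
    return sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], x)))


def disjoint_negative_sets(negatives, n_sets, rng):
    """Shuffle once, then cut into n_sets non-overlapping chunks (assumption)."""
    pool = list(negatives)
    rng.shuffle(pool)
    size = len(pool) // n_sets
    return [pool[i * size:(i + 1) * size] for i in range(n_sets)]


def base_scores(base, x):
    # One probability per weak learner, each computed on its own feature subset.
    return [predict_logreg(w, [x[j] for j in feats]) for feats, w in base]


def fit_negstacking(pos, neg, n_learners=5, subspace=0.6, seed=0):
    rng = random.Random(seed)
    n_feat = len(pos[0])
    k = max(1, int(subspace * n_feat))
    neg_sets = disjoint_negative_sets(neg, n_learners, rng)
    base = []
    for neg_set in neg_sets:
        feats = sorted(rng.sample(range(n_feat), k))  # feature subspacing
        l2 = rng.choice([0.0, 0.001, 0.01])           # hyperparameter perturbation
        # Same positives for every weak learner, different negatives.
        X = [[x[j] for j in feats] for x in pos + neg_set]
        y = [1] * len(pos) + [0] * len(neg_set)
        base.append((feats, fit_logreg(X, y, l2=l2)))
    # Meta-learner: logistic regression over the weak learners' scores.
    # Simplification: scores are in-sample, not out-of-fold.
    used_neg = [x for s in neg_sets for x in s]
    Z = [base_scores(base, x) for x in pos + used_neg]
    y_meta = [1] * len(pos) + [0] * len(used_neg)
    return base, fit_logreg(Z, y_meta)


def predict_negstacking(model, x):
    base, meta = model
    return predict_logreg(meta, base_scores(base, x))


if __name__ == "__main__":
    # Synthetic, imbalanced toy data (not DTI data): 20 positives, 200 negatives.
    rng = random.Random(42)
    pos = [[rng.gauss(1.0, 0.5) for _ in range(6)] for _ in range(20)]
    neg = [[rng.gauss(-1.0, 0.5) for _ in range(6)] for _ in range(200)]
    model = fit_negstacking(pos, neg)
    print(predict_negstacking(model, [2.0] * 6))
    print(predict_negstacking(model, [-2.0] * 6))
```

In this sketch every weak learner sees a balanced-sized slice of the negatives, so the ensemble as a whole uses many more negative samples than any single learner would, which is the motivation the abstract gives. A closer reproduction would likely swap in stronger weak learners and train the meta-learner on held-out predictions.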

Updated: 2020-01-22