Seq-BEL: Sequence-Based Ensemble Learning for Predicting Virus-Human Protein-Protein Interaction
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2020-07-09, DOI: 10.1109/tcbb.2020.3008157
Yingjun Ma, Tingting He, Yu-Ting Tan, Xingpeng Jiang

Infectious diseases are among the most significant and widespread health problems today, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because verified non-interacting protein pairs are scarce and the ratio of positive to negative samples is severely imbalanced, supervised learning algorithms are not well suited to this prediction task. At the same time, because information on viral proteins is limited and their sequences are highly dissimilar, some ensemble learning models generalize poorly. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict potential virus-human protein-protein interactions (PPIs). Specifically, starting from the amino acid sequences of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human proteins and viral proteins, and then combines these similarities and features to score potential virus-human PPIs. The computational results show that Seq-BEL predicts potential virus-human PPIs successfully and outperforms other state-of-the-art methods. More importantly, Seq-BEL also performs well on new human proteins and new viral proteins. In addition, the model is robust and generalizes well, so it can serve as an effective tool for virus-human PPI prediction.
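To make the idea of scoring a candidate pair from sequence similarity and known interactions more concrete, here is a minimal sketch. It is not the Seq-BEL algorithm: the abstract does not specify which features or similarity measures are used or how they are combined, so the 2-mer composition feature, the cosine similarity, the nearest-known-pair scoring rule, the function names, and the toy sequences are all assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Count overlapping k-mers in an amino acid sequence (assumed feature)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def score_pair(virus_seq, human_seq, known_pairs, k=2):
    """Score a candidate pair by its closest known interacting pair.

    known_pairs is a list of (virus_seq, human_seq) tuples that are known
    to interact. The score is the largest product of the virus-side and
    human-side similarities (a hypothetical combination rule).
    """
    v = kmer_profile(virus_seq, k)
    h = kmer_profile(human_seq, k)
    sims = [
        cosine(v, kmer_profile(kv, k)) * cosine(h, kmer_profile(kh, k))
        for kv, kh in known_pairs
    ]
    return max(sims, default=0.0)


# Toy, made-up sequences; real inputs would be full protein sequences.
known = [("MKTAYIAK", "GSHMLEDP")]
candidate_score = score_pair("MKTAYLAK", "GSHMLEEP", known)
```

Because the score relies only on sequences and known positive pairs, a sketch like this needs no negative (non-interacting) examples, which matches the motivation stated in the abstract; how Seq-BEL itself handles this is described in the paper, not here.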

Last updated: 2020-07-09