Compact Genetic Algorithm-Based Feature Selection for Sequence-Based Prediction of Dengue–Human Protein Interactions
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6), Pub Date: 2021-03-17, DOI: 10.1109/tcbb.2021.3066597
Lopamudra Dey, Anirban Mukhopadhyay

Dengue virus (DENV) infection is a rapidly spreading mosquito-borne viral infection in humans. Every year, around 50 million people are affected by DENV infection, resulting in 20,000 deaths. Despite recent experiments aimed at understanding how dengue infection functions in the human body, several functionally important DENV-human protein-protein interactions (PPIs) remain unrecognized. This article presents a model for predicting new DENV-human PPIs by combining different sequence-based features of human and dengue proteins: amino acid composition, dipeptide composition, conjoint triad, pseudo amino acid composition, and pairwise sequence similarity between dengue and human proteins. A Learning Vector Quantization (LVQ)-based Compact Genetic Algorithm (CGA) model is proposed for feature subset selection. CGA is a probabilistic technique that simulates the behavior of a Genetic Algorithm (GA) with lower memory and time requirements. DENV-human PPIs are predicted with the weighted Random Forest (RF) technique, which was found to perform better than the other classifiers tested. The model predicts 1013 PPIs between 335 human proteins and 10 dengue proteins. All predicted interactions are validated by literature filtering, GO-based assessment, and KEGG pathway enrichment analysis. This study is intended to encourage the identification of potential targets for more effective anti-dengue drug discovery.
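
To illustrate why a CGA needs less memory than a conventional GA, the following is a minimal sketch of a generic compact GA applied to feature-subset selection. It is not the authors' implementation: the abstract describes an LVQ-based model, which is not reproduced here. The fitness function `toy_fitness`, the set `INFORMATIVE`, and all parameter names and defaults (`pop_size`, `max_iters`, `seed`) are assumptions made purely for illustration.

```python
import random

# Hypothetical ground truth for the toy example: indices of "informative" features.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    """Stand-in fitness (assumption): reward informative features, penalize the rest.

    A real pipeline would score the selected subset with a classifier instead.
    """
    hits = sum(bit for i, bit in enumerate(mask) if i in INFORMATIVE)
    noise = sum(bit for i, bit in enumerate(mask) if i not in INFORMATIVE)
    return hits - 0.5 * noise


def compact_ga(n_features, fitness, pop_size=50, max_iters=10_000, seed=0):
    """Generic compact GA: keep one counter per feature instead of a population.

    counts[i] / pop_size is the probability that feature i is selected.
    """
    rng = random.Random(seed)
    counts = [pop_size // 2] * n_features  # start at probability 0.5

    def sample():
        # Draw one candidate feature mask from the current probability vector.
        return [1 if rng.random() < c / pop_size else 0 for c in counts]

    for _ in range(max_iters):
        a, b = sample(), sample()
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        # Shift each probability by 1/pop_size toward the winner where they differ.
        for i in range(n_features):
            if winner[i] != loser[i]:
                delta = 1 if winner[i] == 1 else -1
                counts[i] = min(pop_size, max(0, counts[i] + delta))
        # Stop once every probability has collapsed to 0 or 1.
        if all(c in (0, pop_size) for c in counts):
            break

    return [1 if 2 * c >= pop_size else 0 for c in counts]


mask = compact_ga(8, toy_fitness)
print(mask)  # binary mask: 1 = feature kept, 0 = feature dropped
```

The sketch stores only `n_features` counters rather than `pop_size` full candidate masks, which is the source of the memory saving; the `1/pop_size` update step is what lets the probability vector mimic a population of that size.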

Last updated: 2021-03-17