piEnPred: a bi-layered discriminative model for enhancers and their subtypes via novel cascade multi-level subset feature selection algorithm
Frontiers of Computer Science (IF 3.4), Pub Date: 2021-06-28, DOI: 10.1007/s11704-020-9504-3
Zaheer Ullah Khan, Dechang Pi, Shuanglong Yao, Asif Nawaz, Farman Ali, Shaukat Ali

Enhancers are short DNA cis-elements that can be bound by proteins (activators) to increase the likelihood that a particular gene is transcribed. Enhancers play a significant role in protein formation and in regulating gene transcription. Human diseases such as cancer, inflammatory bowel disease, Parkinson’s, addiction, and schizophrenia have been attributed to genetic variation in enhancers. In the current study, we built a more robust and novel bi-layered computational model, piEnPred. The representative feature vector was constructed as a linear combination of six features. The optimum hybrid feature vector was obtained via the novel Cascade Multi-Level Subset Feature Selection (CM-SFS) algorithm. The first layer predicts enhancers, and the second layer predicts their subtypes. The baseline model obtained 87.88% accuracy, 95.29% sensitivity, 80.47% specificity, an MCC of 0.766, and an ROC value of 0.9603 on layer-1. Similarly, the model obtained 68.24% accuracy, 65.54% sensitivity, 70.95% specificity, an MCC of 0.3654, and an ROC value of 0.7568 on layer-2. On an independent dataset, piEnPred achieved 80.4% accuracy, 82.5% sensitivity, 78.4% specificity, and an MCC of 0.6099 on layer-1. Subsequently, the proposed predictor obtained 72.5% accuracy, 70.0% sensitivity, 75% specificity, and an MCC of 0.4506 on layer-2. The proposed method performed remarkably well compared with other state-of-the-art predictors. For the convenience of most experimental scientists, a user-friendly, freely accessible web server has been developed at bienhancer.pythonanywhere.com.
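
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, and MCC follow from confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical: the abstract does not give dataset sizes, so the example assumes a balanced set of 10,000 sequences per class, chosen only so that the rates match the layer-1 sensitivity and specificity quoted above. It is not the authors' evaluation code.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    # Matthews correlation coefficient; defined as 0 here when the classifier
    # predicts a single class and the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts (assumption): 10,000 enhancers and 10,000 non-enhancers,
# with rates set to the layer-1 sensitivity (95.29%) and specificity (80.47%).
layer1 = binary_metrics(tp=9529, fn=471, tn=8047, fp=1953)
for name, value in layer1.items():
    print(f"{name}: {value:.4f}")
```

Under this balanced-class assumption, the computed accuracy is 0.8788 and the MCC rounds to 0.766, which agrees with the layer-1 figures in the abstract. With unbalanced classes, the same sensitivity and specificity would yield different accuracy and MCC values.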




Updated: 2021-06-29