Deleterious Non-Synonymous Single Nucleotide Polymorphism Predictions on Human Transcription Factors.
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6), Pub Date: 2018-11-21, DOI: 10.1109/tcbb.2018.2882548
Ka-Chun Wong, Shankai Yan, Qiuzhen Lin, Xiangtao Li, Chengbin Peng

Transcription factors (TFs) are major components of human gene regulation. In particular, they bind to specific DNA sequences and regulate neighboring genes in different tissues at different developmental stages. Non-synonymous single nucleotide polymorphisms in their protein-coding sequences can have undesired consequences in humans. Therefore, it is necessary to develop methods for predicting which of those non-synonymous single nucleotide polymorphisms are abnormal. To address this, we have developed and compared different strategies to predict deleterious non-synonymous single nucleotide polymorphisms (also known as missense mutations) in the protein-coding sequences of human TFs. Taking advantage of evolutionary conservation signals, we have developed and compared different classifiers with different feature sets computed from different collections of evolutionarily related sequences. The results indicate that the classic ensemble algorithm, AdaBoost with decision stumps, combined with the orthologous sequence collection, performs the best (this combination is named TFmedic). We have further compared TFmedic with other state-of-the-art methods (i.e., PolyPhen-2 and SIFT) on PolyPhen-2's own datasets, demonstrating that TFmedic can outperform the others. As an application, we have applied TFmedic to all possible missense mutations in all human transcription factors; the proteome-wide results reveal interesting insights that are consistent with existing physicochemical knowledge. A case study with an actual 3D structure is conducted, revealing how TFmedic can contribute to studies of protein-DNA binding complexes.
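
To make the classifier named in the abstract concrete, the following is a minimal sketch of AdaBoost with decision stumps in plain Python. It is not the TFmedic implementation: the two features (a hypothetical conservation score and a noise column), the toy labels, and all function names are assumptions made for illustration, and the abstract does not describe the actual feature sets in this detail.

```python
import math
import random


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: one threshold test on one feature, output in {-1, +1}."""
    return polarity if row[feature] >= threshold else -polarity


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = (float("inf"), 0, 0.0, 1)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, rounds=10):
    """Train AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote: +1 (deleterious) or -1 (neutral)."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1


# Toy data (assumption): column 0 is a hypothetical conservation score in [0, 1],
# column 1 is random noise. Highly conserved positions are labeled deleterious.
random.seed(0)
X = [[i / 39, random.random()] for i in range(40)]
y = [1 if row[0] >= 0.5 else -1 for row in X]

model = adaboost_fit(X, y, rounds=10)
train_acc = sum(adaboost_predict(model, r) == t for r, t in zip(X, y)) / len(X)
print("training accuracy:", train_acc)
```

On this toy data a single stump already separates the classes, so every boosting round selects the same threshold; real conservation features would typically need several different stumps, which is where the weighted vote becomes useful.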

Updated: 2020-03-07