Protein sequence information extraction and subcellular localization prediction with gapped k-Mer method.
BMC Bioinformatics ( IF 2.9 ) Pub Date : 2019-12-30 , DOI: 10.1186/s12859-019-3232-4
Yu-Hua Yao 1, 2 , Ya-Ping Lv 1 , Ling Li 3 , Hui-Min Xu 2 , Bin-Bin Ji 1 , Jing Chen 2 , Chun Li 1 , Bo Liao 1 , Xu-Ying Nan 4

BACKGROUND: Predicting the subcellular localization of proteins is an important task in bioinformatics, with applications in drug design and other areas. Many computational tools for predicting protein subcellular location have been developed in recent decades; existing methods differ in the protein sequence representation techniques and classification algorithms they adopt.

RESULTS: In this paper, we first introduce two protein sequence encoding schemes: dipeptide information with space and gapped k-mer information. We then introduce a quad-tree-based method for calculating the gapped k-mer information.

CONCLUSIONS: The prediction results show that this method not only reduces the feature dimension but also improves the prediction precision of protein subcellular localization.
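
As a rough illustration of the kind of encoding the abstract calls "dipeptide information with space", the Python sketch below counts pairs of residues separated by a fixed number of intervening positions and normalizes the counts into a 400-dimensional frequency vector. The abstract does not give the authors' exact definition, so the function name `gapped_dipeptide_features`, the `gap` parameter, and the normalization are all assumptions made for illustration. The quad-tree-based calculation mentioned in the abstract is not reproduced here.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
PAIR_KEYS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def gapped_dipeptide_features(sequence, gap):
    """Hypothetical helper: frequency of residue pairs separated by `gap` positions.

    gap=0 gives ordinary (adjacent) dipeptide composition.
    Pairs containing non-standard residues are skipped.
    """
    step = gap + 1
    counts = Counter()
    for i in range(len(sequence) - step):
        a, b = sequence[i], sequence[i + step]
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            counts[a + b] += 1
    total = sum(counts.values())
    # Return an all-zero vector when the sequence is too short to form any pair.
    return [counts[k] / total if total else 0.0 for k in PAIR_KEYS]


if __name__ == "__main__":
    features = gapped_dipeptide_features("MKVLAAGIVGLLLA", gap=1)
    print(len(features))
```

Concatenating the vectors for several gap values is one common way to build a larger feature set from such counts; whether the paper does this is not stated in the abstract.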

Updated: 2019-12-30