Prediction of Apoptosis Protein’s Subcellular Localization by Fusing Two Different Descriptors Based on Evolutionary Information
Acta Biotheoretica ( IF 1.4 ) Pub Date : 2018-03-01 , DOI: 10.1007/s10441-018-9319-x
Yunyun Liang 1 , Shengli Zhang 2

Apoptosis proteins play a central role in the development and homeostasis of an organism. Knowing the subcellular localization of an apoptosis protein helps in understanding both the mechanism of apoptosis and the function of the protein. Predicting this localization is a challenging task, and existing feature extraction methods rely mainly on the protein's primary sequence. In this paper we develop a feature extraction model based on two different descriptors of evolutionary information: the 192 frequencies of triplet codons (FTC) in the RNA sequence derived from the protein's primary sequence, and the 190 features from a detrended forward moving-average cross-correlation analysis (DFMCA) based on a position-specific scoring matrix (PSSM) generated by the PSI-BLAST program. Hence, this model is called FTC-DFMCA-PSSM. A 382-dimensional (382D) feature vector is constructed for the proteins in the ZD98, ZW225 and CL317 datasets. A support vector machine is then adopted as the classifier, and the jackknife cross-validation test is used to evaluate the accuracy. Under this objective and rigorous jackknife test, the overall prediction accuracies are further improved. Our model not only broadens the source of the feature information, but also provides a more accurate and reliable automated calculation method for the prediction of apoptosis protein subcellular localization.
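The evaluation protocol in the abstract (fuse two descriptor vectors, then score a classifier with the jackknife, i.e. leave-one-out, test) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fuse`, `nearest_neighbor_predict` and `jackknife_accuracy` are hypothetical, the toy data are made up, and a 1-nearest-neighbour rule stands in for the support vector machine so that the example needs only the Python standard library.

```python
from math import dist


def fuse(ftc, dfmca):
    """Concatenate two descriptor vectors into a single feature vector."""
    return list(ftc) + list(dfmca)


def nearest_neighbor_predict(train_X, train_y, x):
    """Stand-in classifier (1-NN); the paper itself uses an SVM."""
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[best]


def jackknife_accuracy(X, y, predict=nearest_neighbor_predict):
    """Hold out each sample once, fit on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Placeholder vectors with the dimensions quoted in the abstract.
vector = fuse([0.0] * 192, [0.0] * 190)
print(len(vector))  # 382

# Toy two-class data (hypothetical, low-dimensional for readability).
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
y = ["a", "a", "b", "b"]
print(jackknife_accuracy(X, y))  # 1.0
```

Because every sample is held out exactly once, the jackknife result is deterministic for a given dataset and classifier, with no dependence on a random split.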

Updated: 2018-03-01