Improving Multi-type Gram-negative Bacterial Secreted Protein Prediction via Protein Evolutionary Information and Feature Ranking
Current Bioinformatics (IF 2.4), Pub Date: 2020-06-30, DOI: 10.2174/1574893614666190730105629
Liang Kong 1 , Lichao Zhang 2 , Shiqian He 1

Background: Gram-negative bacteria interact with their environment by secreting a wide range of particular substrates (such as proteins) across two lipid bilayers from the cytoplasm to the extracellular space. Determining the types of secreted proteins is beneficial for further research on secreted proteins and secretion systems.

Objective: As an essential alternative to experimental methods, this study proposes an accurate machine learning-based method for predicting multiple types of Gram-negative bacterial secreted proteins.

Methods: The main contribution is the combination of auto-cross-correlation analysis and feature ranking to build an effective support vector machine-based predictor of multi-type Gram-negative bacterial secreted proteins. The specifically designed auto-cross-correlation descriptor captures evolutionary correlation information between amino acid pairs along the protein sequence from position-specific scoring matrices. A feature ranking technique was used to analyze and select the most informative features for building the prediction model.
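As an illustration of the kind of descriptor the Methods describe, the following is a minimal sketch of a commonly used auto-cross-covariance transform over a position-specific scoring matrix (PSSM). It is not the authors' exact formulation, which the abstract does not give. The function name `acc_descriptor`, the `max_lag` parameter, and the toy matrix are assumptions made for this example. For each lag, every ordered pair of PSSM columns (a, b) contributes the average product of the centered score of column a at position j and the centered score of column b at position j + lag; pairs with a = b give the auto terms and the remaining pairs give the cross terms.

```python
from typing import List


def acc_descriptor(pssm: List[List[float]], max_lag: int) -> List[float]:
    """Auto-cross-covariance features from a PSSM (illustrative sketch).

    pssm: one row per sequence position, one column per amino acid type
          (20 columns for a real PSSM; the toy example below uses 2).
    max_lag: largest distance between the two positions being correlated
             (hypothetical parameter name, not taken from the paper).
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must be at least 1 and below the sequence length")

    # Center each column on its mean score over the whole sequence.
    means = [sum(row[c] for row in pssm) / length for c in range(n_cols)]
    centered = [[row[c] - means[c] for c in range(n_cols)] for row in pssm]

    features: List[float] = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of position pairs separated by this lag
        for a in range(n_cols):
            for b in range(n_cols):
                total = sum(
                    centered[j][a] * centered[j + lag][b] for j in range(span)
                )
                features.append(total / span)
    return features


# Toy input: 4 positions, 2 columns (assumed values, for illustration only).
toy_pssm = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
toy_features = acc_descriptor(toy_pssm, max_lag=1)
```

With a real 20-column PSSM this sketch yields 400 features per lag, so the feature vector has a fixed length regardless of sequence length. That fixed length is what lets proteins of different sizes be passed to a feature ranking step and then to a support vector machine.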

Results: Several accuracy measures obtained from independent dataset tests are reported on two benchmark datasets. Compared with state-of-the-art prediction methods, the proposed method improves overall accuracy by 2.91% and 2.25% on the two datasets, respectively.

Conclusion: Our study will provide an important guide for utilizing protein evolutionary information in further research on bacterial secreted proteins.




Updated: 2020-06-30