An ensemble method for multi-type Gram-negative bacterial secreted protein prediction by integrating different PSSM-based features.
SAR and QSAR in Environmental Research ( IF 3 ) Pub Date : 2019-02-11 , DOI: 10.1080/1062936x.2019.1573438
L Kong 1 , L Zhang 2, 3

In Gram-negative bacteria, a wide range of proteins are secreted by highly specialized secretion systems. These secreted proteins play essential roles in the response of bacteria to their environment and in several physiological processes such as adhesion, pathogenicity, adaptation and survival. Identifying secreted proteins in Gram-negative bacteria may therefore assist in understanding the secretion mechanism and in developing new antimicrobial strategies. Because a single-feature model is unlikely to capture the relevant information comprehensively, this paper uses three kinds of feature models to represent protein samples, obtained by applying composition analysis, correlation analysis and a smoothing encoding method to position-specific scoring matrix (PSSM) profiles. A support vector machine-based ensemble method using these hybrid features was developed to predict multi-type Gram-negative bacterial secreted proteins. The method achieves overall accuracies of 97.09% and 96.51% in an independent dataset test and a jackknife test, respectively, on a public test dataset, which are 3.49% and 2.32% higher, respectively, than the results obtained by other methods. These results show the effectiveness and stability of the proposed ensemble method. It is anticipated that the method will provide useful information for further research on bacterial secreted proteins and secretion systems.
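
The abstract names three PSSM-based feature views but does not give their formulas. The following is a minimal Python sketch of one common reading of each, not the authors' implementation: composition as per-column means, correlation as a per-column auto-covariance at a fixed lag, and smoothing as a centred moving average over rows. The function names, the `lag` and `window` parameters, and the toy two-column matrix are all assumptions made for illustration; a real PSSM has 20 columns, one per amino acid.

```python
from typing import List

Matrix = List[List[float]]  # one row per residue, one column per amino acid


def pssm_composition(pssm: Matrix) -> List[float]:
    """Composition view (assumed form): mean score of each column."""
    n = len(pssm)
    return [sum(row[j] for row in pssm) / n for j in range(len(pssm[0]))]


def pssm_autocovariance(pssm: Matrix, lag: int) -> List[float]:
    """Correlation view (assumed form): per-column auto-covariance
    between positions i and i + lag, averaged over all valid i."""
    n = len(pssm)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < sequence length")
    means = pssm_composition(pssm)
    return [
        sum(
            (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
            for i in range(n - lag)
        ) / (n - lag)
        for j in range(len(means))
    ]


def pssm_smooth(pssm: Matrix, window: int) -> Matrix:
    """Smoothing view (assumed form): replace each row by the mean of the
    rows inside a centred window, truncated at the sequence ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(pssm)
    smoothed = []
    for i in range(n):
        rows = pssm[max(0, i - half): min(n, i + half + 1)]
        smoothed.append(
            [sum(r[j] for r in rows) / len(rows) for j in range(len(pssm[0]))]
        )
    return smoothed


# Toy 3-residue, 2-column matrix, purely illustrative.
toy = [[1, 2], [3, 4], [5, 6]]
composition = pssm_composition(toy)
correlation = pssm_autocovariance(toy, lag=2)
smoothed = pssm_smooth(toy, window=3)
```

In a pipeline of the kind the abstract describes, each view would feed a support vector machine and the per-view predictions would then be combined. The combination rule and SVM settings are not stated in the abstract, so they are left out of this sketch.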




Updated: 2019-02-11