Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage
mSystems (IF 6.4), Pub Date: 2020-10-27, DOI: 10.1128/msystems.00833-20
Patrick Willems 1, Igor Fijalkowski 1, Petra Van Damme 2

Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation.

Updated: 2020-10-28