Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions.
Bioinformatics (IF 4.4), Pub Date: 2019-12-15, DOI: 10.1093/bioinformatics/btz383
Ana S C Silva 1, 2, 3 , Robbin Bouwmeester 1, 2, 3 , Lennart Martens 1, 2, 3 , Sven Degroeve 1, 2, 3

MOTIVATION: Post-processing tools that maximize the information gained from a proteomics search engine are widely accepted and used by the community. The most notable example is Percolator, a semi-supervised machine learning model that learns a new scoring function for a given dataset. The use of such tools is, however, bound to the search engine's scoring scheme, which does not always make full use of the intensity information present in a spectrum. We aim to show how such a tool can be applied in a way that maximizes the use of spectrum intensity information by leveraging another machine learning-based tool, MS2PIP, which predicts fragment ion peak intensities.

RESULTS: We show that comparing predicted intensities to annotated experimental spectra by calculating direct similarity metrics provides enough information for a tool such as Percolator to accurately separate two classes of peptide-to-spectrum matches. This approach extracts more information from the data than simpler intensity-based metrics, such as peak counting or summing explained intensities, while maintaining control of statistics such as the false discovery rate.

AVAILABILITY AND IMPLEMENTATION: All of the code is available online at https://github.com/compomics/ms2rescore.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
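The comparison of predicted and experimental intensities described in the results can be illustrated with a minimal sketch. It is not the ms2rescore implementation: the function names, the choice of metrics (Pearson correlation, cosine similarity, absolute differences), and the example intensities are all assumptions made for illustration. It assumes the predicted and observed intensities have already been matched ion by ion into two lists of equal length.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cosine(xs, ys):
    """Cosine similarity of two equal-length lists (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def similarity_features(predicted, observed):
    """Hypothetical feature set for one peptide-to-spectrum match.

    Both inputs hold one intensity per fragment ion, in the same ion order.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    abs_diffs = [abs(p - o) for p, o in zip(predicted, observed)]
    return {
        "pearson": pearson(predicted, observed),
        "cosine": cosine(predicted, observed),
        "mean_abs_diff": sum(abs_diffs) / len(abs_diffs),
        "max_abs_diff": max(abs_diffs),
    }


if __name__ == "__main__":
    # Made-up intensities for four fragment ions, for illustration only.
    predicted = [0.10, 0.45, 0.30, 0.15]
    observed = [0.12, 0.40, 0.33, 0.15]
    for name, value in similarity_features(predicted, observed).items():
        print(f"{name}: {value:.4f}")
```

A feature dictionary of this kind, computed for every peptide-to-spectrum match, is the sort of input a semi-supervised rescoring tool could learn from, in place of or alongside the search engine's own score.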

Updated: 2020-01-13