Construction of Human Proteoform Families from 21 Tesla Fourier Transform Ion Cyclotron Resonance Mass Spectrometry Top-Down Proteomic Data, Journal of Proteome Research



Journal of Proteome Research (IF 3.8) Pub Date: 2020-10-19, DOI: 10.1021/acs.jproteome.0c00403
Leah V Schaffer₁, Lissa C Anderson₂, David S Butcher₂, Michael R Shortreed₁, Rachel M Miller₁, Caitlin Pavelec₁, Lloyd M Smith₁


Identification of proteoforms, the different forms of a protein, is important for understanding biological processes. A proteoform family is the set of different proteoforms derived from the same gene. We previously developed the software program Proteoform Suite, which constructs proteoform families and identifies proteoforms by intact-mass analysis. Here, we have applied this approach to top-down proteomic data acquired on the 21 tesla Fourier transform ion cyclotron resonance mass spectrometer at the National High Magnetic Field Laboratory (data available on the MassIVE platform with identifier MSV000085978). We explored the ability to construct proteoform families and identify proteoforms from the high-mass-accuracy data that this instrument provides, using a complex cell lysate sample from the MCF-7 human breast cancer cell line. There were 2830 observed experimental proteoforms, of which 932 were identified, 44 were ambiguous, and 1854 were unidentified. Of the 932 unique identified proteoforms, 766 were identified by top-down MS2 analysis at a 1% false discovery rate (FDR) using TDPortal, and 166 were additional intact-mass identifications (∼4.7% calculated global FDR) made using Proteoform Suite. We recently published a proteoform level schema to represent ambiguity in proteoform identifications. We implemented this proteoform level classification in Proteoform Suite for intact-mass identifications, which enables users to determine the ambiguity level and the sources of ambiguity for each intact-mass proteoform identification.
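The counts reported above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, not part of Proteoform Suite or TDPortal: the variable names and the `share` helper are illustrative assumptions, and the percentages are derived here from the reported counts rather than stated by the authors.

```python
# Counts as reported in the abstract.
observed = 2830
identified, ambiguous, unidentified = 932, 44, 1854
top_down_ids, intact_mass_ids = 766, 166


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The three outcome classes should partition the observed proteoforms.
assert identified + ambiguous + unidentified == observed
# The two identification routes should partition the identified set.
assert top_down_ids + intact_mass_ids == identified

# Derived percentages (hypothetical labels, computed from the counts above).
breakdown = {
    "identified / observed": share(identified, observed),
    "ambiguous / observed": share(ambiguous, observed),
    "unidentified / observed": share(unidentified, observed),
    "top-down MS2 / identified": share(top_down_ids, identified),
    "intact-mass only / identified": share(intact_mass_ids, identified),
}

for label, pct in breakdown.items():
    print(f"{label}: {pct}%")
```

Under these counts, roughly a third of the observed proteoforms were identified, and intact-mass analysis accounts for a little under a fifth of those identifications.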


Updated: 2020-10-19
