Robust determination of differential abundance in shotgun proteomics using nonparametric statistics
Molecular Omics (IF 2.9), Pub Date: 2018-09-17, DOI: 10.1039/c8mo00077h
Patrick Slama, Michael R Hoopmann, Robert L Moritz, Donald Geman

Label-free shotgun mass spectrometry enables the detection of significant changes in protein abundance between different conditions. Because cohort sizes and replication are often limited, the large ratio of potential protein markers to samples and the many null measurements pose important technical challenges to conventional parametric models. From a statistical perspective, a scenario similar to that of unlabeled proteomics is encountered in genomics when looking for differentially expressed genes. Still, the difficulty of detecting a large fraction of the true positives without a high false discovery rate is arguably greater in proteomics, owing to even smaller sample sizes and peptide-to-peptide variability in detectability. These constraints argue for nonparametric (or distribution-free) tests on normalized peptide values, which minimize the number of free parameters, and for measuring significance with permutation testing. We propose such a procedure with a class-based statistic, no parametric assumptions, and no parameters to select other than a nominal false discovery rate. Our method was tested on a new dataset, available via ProteomeXchange with identifier PXD006447. The dataset was prepared using a standard proteolytic digest of a human protein mixture at 1.5-fold to 3-fold protein concentration changes, diluted into a constant background of yeast proteins. We demonstrate the method's superiority over other approaches in terms of the realized sensitivity and realized false discovery rates determined by ground truth, and recommend it for detecting differentially abundant proteins from MS data.
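
To make the idea of a distribution-free test with permutation-based significance concrete, here is a minimal sketch in Python. It is not the class-based statistic proposed in the paper, which the abstract does not specify; as a stand-in it uses the absolute difference in mean rank between two conditions for a single protein, and it enumerates every relabeling of the samples to get an exact p-value. The function names `rank_values` and `permutation_p_value` are hypothetical, chosen for illustration only.

```python
from itertools import combinations


def rank_values(values):
    """Return 1-based ranks; tied values (e.g. repeated null
    measurements) share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for the absolute
    difference in mean rank between two groups of samples.

    Assumption: this rank statistic is an illustrative stand-in,
    not the statistic used by the paper's authors.
    """
    pooled = list(group_a) + list(group_b)
    ranks = rank_values(pooled)
    n, n_a = len(pooled), len(group_a)
    total = sum(ranks)

    def statistic(indices_a):
        sum_a = sum(ranks[i] for i in indices_a)
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = statistic(range(n_a))
    n_extreme = 0
    n_relabelings = 0
    # Enumerate every way of assigning n_a samples to condition A.
    for indices_a in combinations(range(n), n_a):
        n_relabelings += 1
        # Small tolerance guards against floating-point round-off.
        if statistic(indices_a) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_relabelings


if __name__ == "__main__":
    # Made-up intensities for one protein in two conditions.
    condition_a = [1.0, 2.0, 3.0, 4.0]
    condition_b = [5.0, 6.0, 7.0, 8.0]
    print(permutation_p_value(condition_a, condition_b))
```

Because the statistic depends only on ranks, no distributional form is assumed, and zero-valued measurements simply enter as ties. Full enumeration is feasible only for small sample sizes such as those typical of proteomics replicates; with larger groups one would sample random relabelings instead. Controlling the false discovery rate across many proteins is a separate step that this sketch does not cover.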

Updated: 2018-12-03