Tail-Robust Quantile Normalization.
Proteomics (IF 3.4), Pub Date: 2020-08-31, DOI: 10.1002/pmic.202000068
Eva Brombacher, Ariane Schad, Clemens Kreutz

High-throughput biological data, such as mass spectrometry (MS)-based proteomics data, suffer from non-biological variance caused by systematic errors. This hinders the estimation of "real" biological signals and, in turn, decreases the power of statistical tests and biases the identification of differentially expressed proteins. To remove such unintended variation while retaining the biological signal of interest, analysis workflows for quantitative MS data typically include a normalization step prior to statistical analysis. Several normalization methods, such as quantile normalization (QN), were originally developed for microarray data. In contrast to microarray data, proteomics data may contain features, in the form of protein intensities, that are consistently high across experimental conditions and hence lie in the tails of the protein intensity distribution. If QN is applied in the presence of such proteins, statistical inference on the features' intensity profiles is impeded because their variance is estimated with bias. A freely available, novel approach is introduced that improves on classical QN by preserving the biological signals of features in the tails of the intensity distribution and by accounting for sample-dependent missing values (MVs): the "tail-robust quantile normalization" (TRQN).
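To make the tail problem concrete, the following is a minimal sketch of classical QN only, not of TRQN, whose procedure the abstract does not spell out. The function name `quantile_normalize`, the features-by-samples layout, and the toy intensities are assumptions made for illustration; the sketch also assumes no missing values and no ties within a sample.

```python
import numpy as np


def quantile_normalize(x):
    """Classical quantile normalization of a features-by-samples matrix.

    Illustrative sketch: assumes no missing values and no ties within a column.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each value within its own sample (column); 0 is the smallest.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Reference distribution: mean of the sorted columns, rank by rank.
    reference = np.sort(x, axis=0).mean(axis=1)
    # Replace each value with the reference value at its rank.
    return reference[ranks]


# Hypothetical toy data: 4 proteins (rows) x 3 samples (columns).
# Protein 0 is the most intense feature in every sample.
x = np.array([
    [9.0, 12.0, 10.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 6.0],
    [4.0, 2.0, 8.0],
])
normalized = quantile_normalize(x)

# Across-sample spread of protein 0 before and after normalization.
spread_before = np.ptp(x[0])
spread_after = np.ptp(normalized[0])
```

Because protein 0 holds the top rank in every sample, classical QN assigns it the same reference value in all three columns, so its across-sample spread drops from 3 to 0. This is the kind of variance distortion for consistently high features that the abstract describes.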

Updated: 2020-08-31