Sampling uncertainty versus method uncertainty: A general framework with applications to omics biomarker selection
Biometrical Journal (IF 1.3), Pub Date: 2019-05-17, DOI: 10.1002/bimj.201800309
Simon Klau 1, Marie-Laure Martin-Magniette 2, 3, 4, Anne-Laure Boulesteix 1, Sabine Hoffmann 1

Uncertainty is a crucial issue in statistics which can be considered from different points of view. One type of uncertainty, typically referred to as sampling uncertainty, arises through the variability of results obtained when the same analysis strategy is applied to different samples. Another type of uncertainty arises through the variability of results obtained when using the same sample but different analysis strategies addressing the same research question. We denote this latter type of uncertainty as method uncertainty. It results from all the choices to be made for an analysis, for example, decisions related to data preparation, method choice, or model selection. In medical sciences, a large part of omics research is focused on the identification of molecular biomarkers, which can either be performed through ranking or by selection from among a large number of candidates. In this paper, we introduce a general resampling-based framework to quantify and compare sampling and method uncertainty. For illustration, we apply this framework to different scenarios related to the selection and ranking of omics biomarkers in the context of acute myeloid leukemia: variable selection in multivariable regression using different types of omics markers, the ranking of biomarkers according to their predictive performance, and the identification of differentially expressed genes from RNA-seq data. For all three scenarios, our findings suggest highly unstable results when the same analysis strategy is applied to two independent samples, indicating high sampling uncertainty and a comparatively smaller, but non-negligible method uncertainty, which strongly depends on the methods being compared.
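
The distinction drawn in the abstract can be illustrated with a small resampling sketch. This is not the authors' framework: the two selection rules, the use of Jaccard distance as the instability measure, the synthetic data, and all function names are assumptions chosen only for illustration. The idea is to split the data repeatedly into two disjoint halves, then compare (a) the same method on both halves, as a proxy for sampling uncertainty, and (b) two methods on the same half, as a proxy for method uncertainty.

```python
import numpy as np


def top_k_by_correlation(X, y, k):
    """Hypothetical rule A: k features with largest |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    return set(np.argsort(-np.abs(corr))[:k].tolist())


def top_k_by_mean_difference(X, y, k):
    """Hypothetical rule B: k features with largest |mean difference|
    between observations above and below the median of y."""
    high = y > np.median(y)
    diff = X[high].mean(axis=0) - X[~high].mean(axis=0)
    return set(np.argsort(-np.abs(diff))[:k].tolist())


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0 means identical selections, 1 means disjoint."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def compare_uncertainties(X, y, method_a, method_b, k=10, n_splits=50, seed=0):
    """Average instability over random splits into two disjoint halves.

    Returns (sampling, method):
      sampling = method_a on half 1 vs. method_a on half 2
      method   = method_a on half 1 vs. method_b on half 1
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sampling, method = [], []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        first, second = perm[: n // 2], perm[n // 2:]
        sel_a1 = method_a(X[first], y[first], k)
        sel_a2 = method_a(X[second], y[second], k)
        sel_b1 = method_b(X[first], y[first], k)
        sampling.append(jaccard_distance(sel_a1, sel_a2))
        method.append(jaccard_distance(sel_a1, sel_b1))
    return float(np.mean(sampling)), float(np.mean(method))


# Synthetic data (assumption): 200 observations, 100 features,
# of which only the first 5 carry signal.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 100))
y_demo = X_demo[:, :5].sum(axis=1) + demo_rng.normal(size=200)

sampling_u, method_u = compare_uncertainties(
    X_demo, y_demo, top_k_by_correlation, top_k_by_mean_difference
)
print(f"sampling instability: {sampling_u:.2f}, method instability: {method_u:.2f}")
```

Both numbers lie between 0 and 1, so they can be compared directly. Which one is larger depends on the data and on how different the two rules are, in line with the abstract's remark that method uncertainty depends strongly on the methods being compared.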

Updated: 2019-05-17