Inverse probability weighting is an effective method to address selection bias during the analysis of high dimensional data
Genetic Epidemiology (IF 1.7), Pub Date: 2021-06-15, DOI: 10.1002/gepi.22418
Patrick M Carry 1,2, Lauren A Vanderlinden 1,3, Fran Dong 4, Teresa Buckner 1, Elizabeth Litkowski 1, Timothy Vigers 3, Jill M Norris 1, Katerina Kechris 3
Omics studies frequently use samples collected during cohort studies. Conditioning on sample availability can cause selection bias if sample availability is nonrandom. Inverse probability weighting (IPW) is purported to reduce this bias. We evaluated IPW in an epigenome-wide analysis testing the association between DNA methylation (261,435 probes) and age in healthy adolescent subjects (n = 114). We simulated age and sex to be correlated with sample selection and then evaluated four conditions: complete population/no selection bias (all subjects), naïve selection bias (no adjustment), and IPW selection bias (selection bias with IPW adjustment). Assuming the complete population condition represented the “truth,” we compared each condition to the complete population condition. Bias or difference in associations between age and methylation was reduced in the IPW condition versus the naïve condition. However, genomic inflation and type 1 error were higher in the IPW condition relative to the naïve condition. Postadjustment using bacon, type 1 error and inflation were similar across all conditions. Power was higher under the IPW condition compared with the naïve condition before and after inflation adjustment. IPW methods can reduce bias in genome-wide analyses. Genomic inflation is a potential concern that can be minimized using methods that adjust for inflation.
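The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a stratified selection model (the observed selection fraction within sex-by-age-group strata) in place of a fitted regression model, a single simulated probe, and made-up effect sizes and selection probabilities. The names `ipw_weights` and `weighted_slope` are hypothetical helpers introduced only for this example.

```python
import random
from collections import defaultdict


def ipw_weights(strata, selected):
    """Return one weight per selected subject: 1 / (observed selection
    fraction in that subject's stratum). Assumed, simplified selection model."""
    total = defaultdict(int)
    kept = defaultdict(int)
    for s, sel in zip(strata, selected):
        total[s] += 1
        kept[s] += sel
    return [total[s] / kept[s] for s, sel in zip(strata, selected) if sel]


def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x with an intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


if __name__ == "__main__":
    random.seed(0)
    age, sex, meth = [], [], []
    for _ in range(5000):
        a = random.uniform(10, 18)
        s = random.randint(0, 1)
        # Hypothetical probe: the age effect differs by sex, so nonrandom
        # selection on age and sex can shift the overall age slope.
        m = 0.5 + 0.01 * a + 0.02 * s * (a - 14) + random.gauss(0, 0.02)
        age.append(a)
        sex.append(s)
        meth.append(m)

    # Hypothetical selection mechanism: older subjects with sex == 1
    # are much more likely to have a sample available.
    selected = [int(random.random() < (0.9 if (s == 1 and a >= 14) else 0.3))
                for a, s in zip(age, sex)]
    strata = [(s, a >= 14) for a, s in zip(age, sex)]

    xs = [a for a, k in zip(age, selected) if k]
    ys = [m for m, k in zip(meth, selected) if k]
    w = ipw_weights(strata, selected)

    print("complete population slope:", weighted_slope(age, meth, [1.0] * len(age)))
    print("naive selected slope:     ", weighted_slope(xs, ys, [1.0] * len(xs)))
    print("IPW selected slope:       ", weighted_slope(xs, ys, w))
```

In an epigenome-wide setting the same weights would be reused across every probe, and the resulting test statistics would still need to be checked for inflation, as the abstract notes.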

Updated: 2021-08-19