Association Analysis and Meta-Analysis of Multi-Allelic Variants for Large-Scale Sequence Data
Genes (IF 2.8), Pub Date: 2020-05-25, DOI: 10.3390/genes11050586
Yu Jiang 1, Sai Chen 2, Xingyan Wang 1, Mengzhen Liu 3, William G Iacono 4, John K Hewitt 5, John E Hokanson 6, Kenneth Krauter 5, Markku Laakso 7, Kevin W Li 8, Sharon M Lutz 9, Matthew McGue 3, Anita Pandit 8, Gregory J M Zajac 8, Michael Boehnke 8, Goncalo R Abecasis 8, Scott I Vrieze 3, Bibo Jiang 1, Xiaowei Zhan 10, Dajiang J Liu 1

There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. We discuss practical issues and methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ~18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online.
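As a rough illustration of what encoding a multi-allelic site and modeling its alleles jointly can look like, the sketch below turns diploid genotypes at a single tri-allelic site into one allele-count column per alternate allele and fits all columns in one least-squares regression. This is not the paper's software or its exact method: the function names `encode_multiallelic` and `joint_fit`, the toy genotypes, and the effect sizes are all assumptions made for illustration, and the phenotype is generated without noise so the fit is easy to check.

```python
import numpy as np


def encode_multiallelic(genotypes, n_alt):
    """Count copies of each alternate allele per sample (hypothetical helper).

    genotypes: list of (a1, a2) pairs, where 0 is the reference allele and
    1..n_alt index the alternate alleles at one site.
    Returns an (n_samples, n_alt) array; column k-1 counts copies of allele k.
    """
    dosage = np.zeros((len(genotypes), n_alt))
    for i, pair in enumerate(genotypes):
        for allele in pair:
            if allele > 0:
                dosage[i, allele - 1] += 1
    return dosage


def joint_fit(dosage, phenotype):
    """Regress the phenotype on all alternate alleles at once (hypothetical helper)."""
    design = np.column_stack([np.ones(len(phenotype)), dosage])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return coef[1:]  # drop the intercept, keep one effect per alternate allele


# Toy data (assumed): eight samples at a site with two alternate alleles.
genotypes = [(0, 0), (0, 1), (1, 1), (0, 2), (2, 2), (1, 2), (0, 0), (0, 1)]
dosage = encode_multiallelic(genotypes, n_alt=2)

# Noise-free phenotype with assumed per-allele effects of 0.5 and -0.25.
phenotype = 1.0 + dosage @ np.array([0.5, -0.25])
effects = joint_fit(dosage, phenotype)
print(effects)
```

Keeping one column per alternate allele lets each allele have its own effect estimate, whereas collapsing all alternate alleles into a single count would force them to share one.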

Updated: 2020-05-25