Estimation of DNA contamination and its sources in genotyped samples.
Genetic Epidemiology (IF 2.1), Pub Date: 2019-08-26, DOI: 10.1002/gepi.22257
Gregory J M Zajac 1, Lars G Fritsche 1, Joshua S Weinstock 1, Susan L Dagenais 2, Robert H Lyons 2, Chad M Brummett 3, Gonçalo R Abecasis 1

Array genotyping is a cost-effective and widely used tool that enables assessment of up to millions of genetic markers in hundreds of thousands of individuals. Genotyping array data are typically highly accurate but sensitive to mixing of DNA samples from multiple individuals before or during genotyping. Contaminated samples can lead to genotyping errors and consequently cause false positive signals or reduce power of association analyses. Here, we propose a new method to identify contaminated samples and the sources of contamination within a genotyping batch. Through analysis of array intensity and genotype data from intentionally mixed samples and 22,366 samples of the Michigan Genomics Initiative, an ongoing biobank-based study, we show that our method can reliably estimate contamination. We also show that identifying sources of contamination can implicate problematic sample processing steps and guide process improvements. Compared to existing methods, our approach can estimate the proportion of contaminating DNA more accurately, eliminate the need for external databases of allele frequencies, and provide contamination estimates that are more robust to the ancestral origin of the contaminating sample.

Updated: 2019-11-01