An extended Tajima’s D neutrality test incorporating SNP calling and imputation uncertainties
Statistics and Its Interface (IF 0.8), Pub Date: 2015-01-01, DOI: 10.4310/sii.2015.v8.n4.a4
Qingrun Zhang, Chris Tyler-Smith, Quan Long

To identify evolutionary events from the footprints left in the patterns of genetic variation in a population, people use many statistical frameworks, including neutrality tests. In datasets from current high throughput sequencing and genotyping platforms, it is common to have missing data and low-confidence SNP calls at many segregating sites. However, the traditional statistical framework for neutrality tests does not allow for these possibilities; therefore the usual way of treating missing data is to ignore segregating sites with missing/low confidence calls, regardless of the good SNP calls at these sites in other individuals. In this work, we propose a modified neutrality test, Extended Tajima's D, which incorporates missing data and SNP-calling uncertainties. Because we do not specify any particular error-generating mechanism, this approach is robust and widely applicable. Simulations show that in most cases the power of the new test is better than the original Tajima's D, given the same type I error. Applications to real data show that it detects fewer outliers associated with low quality data.
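
For orientation, the conventional treatment that the abstract criticizes can be sketched as follows. This is a minimal illustration of the classical Tajima's D (Tajima, 1989) with complete-case filtering, not the Extended Tajima's D proposed in the paper, whose formulas are not given in the abstract. The function name `tajimas_d`, the 0/1 haplotype encoding, and the use of `None` for a missing or low-confidence call are assumptions made for this example.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Classical Tajima's D with complete-case filtering (illustrative sketch).

    haplotypes: list of n equal-length sequences; each entry is 0 or 1
    (ancestral/derived allele) or None for a missing call (assumed encoding).
    Returns None when no segregating site survives the filtering.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance constants below vanish for n = 2 and n = 3.
        raise ValueError("need at least 4 sampled sequences")

    # Conventional treatment: discard every site with any missing call,
    # even if the calls for the other individuals at that site are good.
    complete_sites = [col for col in zip(*haplotypes) if None not in col]

    # Keep derived-allele counts of segregating sites only.
    counts = [sum(col) for col in complete_sites]
    counts = [k for k in counts if 0 < k < n]
    s = len(counts)
    if s == 0:
        return None

    # Mean number of pairwise differences (pi).
    pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in counts) / pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with Watterson's estimator S / a1.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

In this sketch a single missing call removes the whole site from both the count of segregating sites and the pairwise-difference estimate, which is the loss of information that motivates the extended test.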

Updated: 2015-01-01