Design of experiments for fine-mapping quantitative trait loci in livestock populations.
BMC Genetics ( IF 2.9 ) Pub Date : 2020-06-29 , DOI: 10.1186/s12863-020-00871-1
Dörte Wittenburg 1 , Sarah Bonk 2 , Michael Doschoris 1 , Henry Reyer 3

Single nucleotide polymorphisms (SNPs) that capture a significant impact on a trait can be identified with genome-wide association studies. High linkage disequilibrium (LD) among SNPs makes it difficult to identify causative variants correctly. Thus, target regions are often reported instead of single SNPs. Sample size not only has a crucial impact on the precision of parameter estimates but also determines whether a desired level of statistical power can be reached. We study the design of experiments for fine-mapping of signals of a quantitative trait locus in such a target region.

A multi-locus model makes it possible to identify causative variants simultaneously, to state their positions more precisely and to account for existing dependencies. Based on the commonly applied SNP-BLUP approach, we determine the z-score statistic for locally testing non-zero SNP effects and investigate its distribution under the alternative hypothesis. This quantity employs the theoretical instead of the observed dependence between SNPs; it can be set up as a function of paternal and maternal LD for any given population structure. We simulated multiple paternal half-sib families and considered a target region of 1 Mbp. A bimodal distribution of estimated sample size was observed, particularly if more than two causative variants were assumed. The median of the estimates constituted the final proposal of optimal sample size; it was consistently smaller than the sample size estimated from a single-SNP investigation, which was used as the baseline approach. The second mode pointed to inflated sample sizes and could be explained by blocks of varying linkage phases leading to negative correlations between SNPs. Optimal sample size increased almost linearly with the number of signals to be identified but depended much more strongly on the assumed heritability. For instance, three times as many samples were required if heritability was 0.1 compared to 0.3. An R package is provided that comprises all required tools.
Our approach incorporates information about the population structure into the design of experiments. Compared to a conventional method, this leads to a reduced estimate of sample size enabling the resource-saving design of future experiments for fine-mapping of candidate variants.
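
To give a rough feel for why sample size reacts so strongly to heritability, the sketch below uses a generic textbook power approximation for testing a single marker. It is not the multi-locus SNP-BLUP method of the paper and not the API of the authors' R package. The function name `single_snp_sample_size`, the equal split of heritability across causative variants, and the default significance level and power are all assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def single_snp_sample_size(h2, n_signals, alpha=0.05, power=0.8):
    """Approximate sample size for detecting one causative variant.

    Illustrative assumptions (not taken from the paper):
    - heritability h2 is split equally among n_signals causative variants,
      so each variant explains q2 = h2 / n_signals of phenotypic variance;
    - a two-sided z-test at level alpha with the requested power is used;
    - LD between SNPs and family structure are ignored.
    """
    q2 = h2 / n_signals
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Standard approximation: n * q2 / (1 - q2) >= (z_alpha + z_beta)^2
    return math.ceil((z_alpha + z_beta) ** 2 * (1 - q2) / q2)


if __name__ == "__main__":
    for n_signals in (1, 2, 3, 4):
        n_low = single_snp_sample_size(0.1, n_signals)
        n_high = single_snp_sample_size(0.3, n_signals)
        print(n_signals, n_low, n_high, round(n_low / n_high, 2))
```

Under these simplifying assumptions the required sample size grows roughly linearly with the number of signals, and it is about three to four times larger at a heritability of 0.1 than at 0.3. This matches the abstract qualitatively. The absolute numbers are not comparable with the paper's estimates, because the sketch ignores the LD structure and a realistic significance threshold.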

Updated: 2020-06-29