A Bayesian hierarchical model for identifying significant polygenic effects while controlling for confounding and repeated measures
Statistical Applications in Genetics and Molecular Biology (IF 0.8), Pub Date: 2017-11-15, DOI: 10.1515/sagmb-2017-0044
Christopher McMahan, James Baurley, William Bridges, Chase Joyner, Muhamad Fitra Kacamarga, Robert Lund, Carissa Pardamean, Bens Pardamean

Genomic studies of plants often seek to identify genetic factors associated with desirable traits. Evaluating genetic markers one by one (i.e., a marginal analysis) may not identify important polygenic and environmental effects. Further, confounding due to growing conditions and genetic similarities among plant varieties may influence conclusions. When developing new plant varieties to optimize yield or to thrive in future adverse conditions (e.g., flood, drought), scientists seek a complete understanding of how these factors influence desirable traits. Motivated by a study design that measures rice yield across different seasons, fields, and plant varieties in Indonesia, we develop a regression method that identifies significant genomic factors while simultaneously controlling for field factors and genetic similarities among the plant varieties. Our approach develops a Bayesian maximum a posteriori probability (MAP) estimator under a generalized double Pareto shrinkage prior. Through a hierarchical representation of the proposed model, a novel and computationally efficient expectation-maximization (EM) algorithm is developed for variable selection and estimation. The performance of the proposed approach is demonstrated through simulation, and the approach is used to analyze rice yields from a pilot study conducted by the Indonesian Center for Rice Research.
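
To make the MAP-by-EM idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a plain linear model y = X beta + noise with a known noise variance, and it omits the field factors, genetic-similarity terms, and repeated-measures structure that the paper controls for. It relies on the standard normal scale-mixture representation of the generalized double Pareto prior, under which the E-step reduces to an expected inverse local scale per coefficient and the M-step to a weighted ridge solve. The function name `gdp_map_em`, its parameters (`sigma2`, `alpha`, `eta`), and the simulated toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def gdp_map_em(X, y, sigma2=1.0, alpha=1.0, eta=1.0, n_iter=500, tol=1e-10):
    """EM sketch for the MAP estimate of beta in y = X beta + N(0, sigma2 * I)
    under independent generalized double Pareto priors on each beta_j.

    Assumptions (not from the paper): sigma2 is known, and there are no
    random effects or other confounder terms in the model.
    """
    p = X.shape[1]
    sigma = np.sqrt(sigma2)
    XtX = X.T @ X
    Xty = X.T @ y
    # Start from a lightly ridged least-squares fit.
    beta = np.linalg.solve(XtX + 1e-6 * np.eye(p), Xty)
    for _ in range(n_iter):
        abs_b = np.abs(beta)
        # E-step: E[1 / tau_j | beta_j] for the prior beta_j ~ N(0, sigma2 * tau_j).
        # The small constant guards against division by zero as beta_j shrinks.
        d = (alpha + 1.0) * sigma2 / ((abs_b + 1e-12) * (sigma * eta + abs_b))
        # M-step: weighted ridge regression with penalty weights d.
        beta_new = np.linalg.solve(XtX + np.diag(d), Xty)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Toy usage on simulated data (illustrative only, not the rice study data).
rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + rng.standard_normal(n)

# A smaller eta steepens the penalty near zero, so small effects shrink harder.
beta_hat = gdp_map_em(X, y, sigma2=1.0, alpha=1.0, eta=0.05)
print(np.round(beta_hat, 3))
```

Because the penalty implied by this prior is non-convex, the iteration converges to a local posterior mode that depends on the starting value; the least-squares start used here is one reasonable choice, not the only one.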

Updated: 2017-11-15