Hierarchical Modelling of Haplotype Effects on a Phylogeny
Frontiers in Genetics ( IF 2.8 ) Pub Date : 2020-12-15 , DOI: 10.3389/fgene.2020.531218
Maria Lie Selle 1 , Ingelin Steinsland 1 , Finn Lindgren 2 , Vladimir Brajkovic 3 , Vlatka Cubric-Curik 3 , Gregor Gorjanc 4

We introduce a hierarchical model to estimate haplotype effects based on phylogenetic relationships between haplotypes and their association with observed phenotypes. A population contains many distinct haplotypes, though not all possible ones, and there are few observations per haplotype. Further, haplotype frequencies tend to vary substantially. Such a data structure challenges the estimation of haplotype effects. However, haplotypes often differ by only a few mutations, and leveraging these similarities can improve the estimation of effects. We build on the extensive literature and develop an autoregressive model of order one that models haplotype effects by leveraging phylogenetic relationships described with a directed acyclic graph. The phylogenetic relationships can take the form of either a tree or a network, and we refer to the model as the haplotype network model. The model can be included as a component in a phenotype model to estimate associations between haplotypes and phenotypes. Our key contribution is that we obtain a sparse model and that, by using hierarchical autoregression, the flow of information between similar haplotypes is estimated from the data. A simulation study shows that the hierarchical model can improve estimates of haplotype effects compared to an independent haplotype model, especially when there are few observations for a specific haplotype. We also compared it to a mutation model and observed comparable performance, though the haplotype model has the potential to capture background-specific effects. We demonstrate the model with a study of mitochondrial haplotype effects on milk yield in cattle. We provide R code to fit the model with the INLA package.
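
To make the idea of an order-one autoregression on a phylogeny concrete, the following is a minimal Python sketch. It is not the authors' R/INLA code. It assumes the simplest case, a tree in which every haplotype has at most one parent, and it assumes each effect equals `rho` times its parent's effect plus independent Gaussian noise, with the root given the stationary variance `sigma2 / (1 - rho**2)`. The function name `ar1_tree_precision` and the `parent` encoding are hypothetical, and the published model may differ, for example in how it handles networks or scales by the number of mutations.

```python
import numpy as np


def ar1_tree_precision(parent, rho, sigma2=1.0):
    """Precision matrix of AR(1) haplotype effects on a tree (illustrative sketch).

    parent[i] is the index of haplotype i's parent, or None for the root.
    Assumed model (not taken from the paper's code):
        h_root  ~ N(0, sigma2 / (1 - rho^2))
        h_child = rho * h_parent + e,   e ~ N(0, sigma2)
    """
    n = len(parent)
    T = np.eye(n)            # maps effects h to independent innovations e = T h
    d = np.full(n, sigma2)   # innovation variances
    for child, p in enumerate(parent):
        if p is None:
            # Root: stationary variance, so all haplotypes share one marginal variance.
            d[child] = sigma2 / (1.0 - rho**2)
        else:
            T[child, p] = -rho
    # Q = T' D^{-1} T has nonzeros only on the diagonal and on parent-child pairs.
    return T.T @ np.diag(1.0 / d) @ T


# Toy phylogeny: haplotype 0 is the root, 1 and 2 descend from 0, 3 descends from 1.
Q = ar1_tree_precision([None, 0, 0, 1], rho=0.5)
C = np.linalg.inv(Q)  # implied covariance between haplotype effects
```

In this toy tree the precision matrix has a zero entry for the two sibling haplotypes 1 and 2, even though their effects remain correlated through the shared parent. That is the sense in which such a model is sparse while still sharing information between similar haplotypes. In a hierarchical fit, `rho` would be estimated from the data rather than fixed as it is here.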




Updated: 2021-01-16