A computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional quantitative trait loci discovery
The Journal of the Royal Statistical Society: Series C (Applied Statistics) (IF 1.6), Pub Date: 2021-05-08, DOI: 10.1111/rssc.12490
Leonardo Bottolo 1, 2, 3, Marco Banterle 4, Sylvia Richardson 2, 3, Mika Ala-Korpela 5, 6, Marjo-Riitta Järvelin 7, 8, 9, 10, 11, Alex Lewin 4

Our work is motivated by the search for metabolite quantitative trait loci (QTL) in a cohort of more than 5000 people. There are 158 metabolites measured by NMR spectroscopy in the 31-year follow-up of the Northern Finland Birth Cohort 1966 (NFBC66). These metabolites, as with many multivariate phenotypes produced by high-throughput biomarker technology, exhibit strong correlation structures. Existing approaches for combining such data with genetic variants for multivariate QTL analysis generally ignore phenotypic correlations or make restrictive assumptions about the associations between phenotypes and genetic loci. We present a computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional data, with cell-sparse variable selection and sparse graphical structure for covariance selection. Cell sparsity allows different phenotype responses to be associated with different genetic predictors and the graphical structure is used to represent the conditional dependencies between phenotype variables. To achieve feasible computation of the large model space, we exploit a factorisation of the covariance matrix. Applying the model to the NFBC66 data with 9000 directly genotyped single nucleotide polymorphisms, we are able to simultaneously estimate genotype–phenotype associations and the residual dependence structure among the metabolites. The R package BayesSUR with full documentation is available at https://cran.r-project.org/web/packages/BayesSUR/

Updated: 2021-05-08