Bayesian mixed effects models for zero-inflated compositions in microbiome data analysis
Annals of Applied Statistics (IF 1.3), Pub Date: 2020-04-16, DOI: 10.1214/19-aoas1295
Boyu Ren, Sergio Bacallado, Stefano Favaro, Tommi Vatanen, Curtis Huttenhower, Lorenzo Trippa

Detecting associations between microbial compositions and sample characteristics is one of the most important tasks in microbiome studies. Most of the existing methods apply univariate models to single microbial species separately, with adjustments for multiple hypothesis testing. We propose a Bayesian analysis for a generalized mixed effects linear model tailored to this application. The marginal prior on each microbial composition is a Dirichlet process, and dependence across compositions is induced through a linear combination of individual covariates, such as disease biomarkers or the subject’s age, and latent factors. The latent factors capture residual variability and their dimensionality is learned from the data in a fully Bayesian procedure. The proposed model is tested in data analyses and simulation studies with zero-inflated compositions. In these settings and within each sample, a large proportion of counts per microbial species are equal to zero. In our Bayesian model a priori the probability of compositions with absent microbial species is strictly positive. We propose an efficient algorithm to sample from the posterior and visualizations of model parameters which reveal associations between covariates and microbial compositions. We evaluate the proposed method in simulation studies, and then analyze a microbiome dataset for infants with type 1 diabetes which contains a large proportion of zeros in the sample-specific microbial compositions.
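
To make the idea of zero-inflated compositions concrete, the following is a minimal simulation sketch, not the authors' model or posterior sampler. It assumes one simple construction: a latent Gaussian score per sample and species is built from a single covariate plus a few latent factors, negative scores are truncated to zero, and the result is normalized. Under that assumption, each species is absent from a given sample with strictly positive probability. The function name `simulate_compositions`, all parameter values, and the squared-truncation step are hypothetical choices made for illustration; the Dirichlet process prior and the learning of the factor dimension described in the abstract are omitted.

```python
import numpy as np


def simulate_compositions(n_samples=50, n_species=20, n_factors=2,
                          depth=1000, seed=0):
    """Simulate zero-inflated compositions from truncated latent scores.

    Illustrative assumption only: latent score = covariate effect
    + latent factors + noise, truncated at zero and normalized.
    """
    rng = np.random.default_rng(seed)

    x = rng.normal(size=n_samples)                       # one covariate per sample
    beta = rng.normal(size=n_species)                    # species-specific effects
    loadings = rng.normal(size=(n_species, n_factors))   # factor loadings
    factors = rng.normal(size=(n_samples, n_factors))    # latent factors
    noise = rng.normal(size=(n_samples, n_species))      # residual noise

    # Linear combination of the covariate and the latent factors.
    latent = np.outer(x, beta) + factors @ loadings.T + noise

    # Truncation: negative scores give exactly zero weight (absent species).
    weights = np.maximum(latent, 0.0) ** 2

    # Guard against a sample with no positive score (fall back to uniform).
    empty = weights.sum(axis=1) == 0
    weights[empty] = 1.0

    probs = weights / weights.sum(axis=1, keepdims=True)

    # Observed counts: multinomial sampling at a fixed sequencing depth.
    counts = np.vstack([rng.multinomial(depth, p) for p in probs])
    return probs, counts


if __name__ == "__main__":
    probs, counts = simulate_compositions()
    print("share of exactly-zero proportions:", (probs == 0).mean())
    print("share of zero counts:", (counts == 0).mean())
```

Because the truncation produces exact zeros before normalization, the zero counts here reflect species that are absent from the composition as well as rare species missed at the chosen depth.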

Updated: 2020-04-16