Dirichlet-multinomial modelling outperforms alternatives for analysis of microbiome and other ecological count data.
Molecular Ecology Resources (IF 7.7), Pub Date: 2020-01-29, DOI: 10.1111/1755-0998.13128
Joshua G Harrison, W John Calder, Vivaswat Shastry, C Alex Buerkle

Molecular ecology regularly requires the analysis of count data that reflect the relative abundance of features of a composition (e.g., taxa in a community, gene transcripts in a tissue). The sampling process that generates these data can be modelled using the multinomial distribution. Replicate multinomial samples inform the relative abundances of features in an underlying Dirichlet distribution. These distributions together form a hierarchical model for relative abundances among replicates and sampling groups. This type of Dirichlet-multinomial modelling (DMM) has been described previously, but its benefits and limitations are largely untested. With simulated data, we quantified the ability of DMM to detect differences in proportions between treatment and control groups, and compared the efficacy of three computational methods to implement DMM: Hamiltonian Monte Carlo (HMC), variational inference (VI), and Gibbs Markov chain Monte Carlo. We report that DMM was better able to detect shifts in relative abundances than analogous analytical tools, while identifying an acceptably low number of false positives. Among methods for implementing DMM, HMC provided the most accurate estimates of relative abundances, and VI was the most computationally efficient. The sensitivity of DMM was exemplified through analysis of previously published data describing lung microbiomes. We report that DMM identified several potentially pathogenic bacterial taxa as more abundant in the lungs of children who aspirated foreign material during swallowing; these differences went undetected with other statistical approaches. Our results suggest that DMM has strong potential as a statistical method to guide inference in molecular ecology.
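The generative structure described in the abstract (a Dirichlet distribution over relative abundances, with multinomial counts drawn per replicate) can be illustrated with a small simulation. The sketch below is not the authors' implementation, which uses HMC, VI, or Gibbs sampling. It assumes NumPy, uses made-up concentration parameters and hypothetical function names, and replaces the hierarchical fit with a conjugate shortcut that pools replicates within each group. Because pooling ignores variation among replicates, its credible intervals will be narrower than those of the hierarchical model.

```python
import numpy as np

def simulate_counts(rng, alpha, n_replicates, n_reads):
    """Draw one composition per replicate from Dirichlet(alpha),
    then draw multinomial read counts from each composition."""
    proportions = rng.dirichlet(alpha, size=n_replicates)
    return np.vstack([rng.multinomial(n_reads, p) for p in proportions])

def posterior_proportions(rng, counts, prior=1.0, n_draws=4000):
    """Conjugate shortcut (an assumption of this sketch): pool replicates,
    so the posterior is Dirichlet(prior + summed counts)."""
    return rng.dirichlet(prior + counts.sum(axis=0), size=n_draws)

rng = np.random.default_rng(0)

# Illustrative concentration parameters: features 3 and 4 swap abundance.
control_alpha = np.array([50.0, 30.0, 15.0, 5.0])
treatment_alpha = np.array([50.0, 30.0, 5.0, 15.0])

control = simulate_counts(rng, control_alpha, n_replicates=10, n_reads=1000)
treatment = simulate_counts(rng, treatment_alpha, n_replicates=10, n_reads=1000)

# Posterior of the between-group difference in each feature's proportion.
diff = posterior_proportions(rng, treatment) - posterior_proportions(rng, control)
lower, upper = np.percentile(diff, [2.5, 97.5], axis=0)

# Flag a feature when its 95% credible interval excludes zero.
shifted = (lower > 0) | (upper < 0)
print(shifted)
```

A feature is flagged when the 95% interval of the posterior difference excludes zero. Under this pooled shortcut, features with no true shift may also be flagged, which is one reason to model replicate-level variation explicitly.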

Updated: 2020-01-29