An empirical Bayes approach to normalization and differential abundance testing for microbiome data.
BMC Bioinformatics (IF 2.9), Pub Date: 2020-06-03, DOI: 10.1186/s12859-020-03552-z
Tiantian Liu 1,2, Hongyu Zhao 2,3, Tao Wang 1,2,4

Advances in DNA sequencing have offered researchers an unprecedented opportunity to study the variety of species living in and on the human body. However, the analysis of microbiome data is complicated by several challenges. First, the sequencing depth may vary by orders of magnitude across samples. Second, many species are rare, so the data often contain many zeros. Third, the specimen is only a fraction of the microbial ecosystem, so the data are compositional and carry only relative information. Other characteristics of microbiome data include pronounced over-dispersion in taxon abundances and the existence of a phylogenetic tree that relates all bacterial species. To address some of these challenges, microbiome analysis workflows often normalize the read counts prior to downstream analysis. However, there are limitations in the current literature on the normalization of microbiome data. Assuming a multinomial distribution for the read counts and a prior for the unknown proportions, we propose an empirical Bayes approach to microbiome data normalization. Using a tree-based extension of the Dirichlet prior, we further extend our method by incorporating the phylogenetic tree into the normalization process. We study the impact of normalization on differential abundance analysis. In the presence of tree structure, we propose a phylogeny-aware detection procedure. Extensive simulations and applications to gut microbiome data demonstrate the superior performance of our empirical Bayes method over other normalization methods and over commonly used methods for differential abundance testing. Original R scripts are available at GitHub (https://github.com/liudoubletian/eBay).
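To make the multinomial-plus-prior idea concrete, here is a minimal Python sketch of one natural reading of it: with a Dirichlet prior, each sample's normalized abundances are taken to be the posterior mean of its proportions, (x_ij + alpha_j) / (N_i + sum_j alpha_j). The abstract does not spell out the estimator, so this is an assumption and not the authors' R implementation. The function names, the `precision` parameter, and the pooled-proportion prior are hypothetical stand-ins; a real empirical Bayes fit would estimate the prior from the data, for example by maximizing the Dirichlet-multinomial marginal likelihood. The tree-based extension is not covered.

```python
def pooled_prior(counts, precision):
    """Hypothetical plug-in prior: alpha_j = precision * pooled proportion of taxon j.

    This is a crude stand-in for estimating the Dirichlet parameters from the data.
    """
    n_taxa = len(counts[0])
    col_totals = [sum(row[j] for row in counts) for j in range(n_taxa)]
    grand_total = sum(col_totals)
    return [precision * total / grand_total for total in col_totals]


def posterior_mean_normalize(counts, alpha):
    """Posterior mean of each sample's proportions under a Dirichlet(alpha) prior."""
    alpha_sum = sum(alpha)
    normalized = []
    for row in counts:
        depth = sum(row)  # sequencing depth N_i of this sample
        normalized.append([(x + a) / (depth + alpha_sum) for x, a in zip(row, alpha)])
    return normalized


# Toy data: 2 samples x 2 taxa, with one zero count.
counts = [[0, 10], [5, 5]]
alpha = pooled_prior(counts, precision=4.0)
normalized = posterior_mean_normalize(counts, alpha)
```

In this toy example the zero count in the first sample is replaced by a small positive proportion borrowed from the pooled data, and each row still sums to one regardless of sequencing depth. A taxon that is zero in every sample would keep a zero under this particular plug-in prior.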

Updated: 2020-06-03