Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju.
Nucleic Acids Research ( IF 16.6 ) Pub Date : 2020-07-07 , DOI: 10.1093/nar/gkaa568
Anna Tovo 1, 2 , Peter Menzel 3 , Anders Krogh 4 , Marco Cosentino Lagomarsino 5, 6 , Samir Suweis 1, 7

Characterizing the species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiome diversity involves assigning individual reads to taxa by comparison to reference databases. Although computational methods for identifying microbial taxa are available, it is well known that inferences from different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria whose compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We therefore propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We then test the proposed method on various mock communities and show that Core-Kaiju reliably predicts both the number of taxa and their abundances. Finally, we apply our method to human gut samples, showing how Core-Kaiju may give a more accurate ecological characterization and a fresh view of real microbiomes.

Updated: 2020-07-07