The Nubeam reference-free approach to analyze metagenomic sequencing reads.
Genome Research (IF 6.2), Pub Date: 2020-09-01, DOI: 10.1101/gr.261750.120
Hang Dai, Yongtao Guan

We present Nubeam (nucleotide be a matrix) as a novel reference-free approach to analyzing short sequencing reads. Nubeam represents nucleotides by matrices, transforms a read into a product of matrices, and assigns numbers to reads based on the product matrix. Nubeam capitalizes on the noncommutative property of matrix multiplication, such that different reads are assigned different numbers and similar reads are assigned similar numbers. A sample, which is a collection of reads, becomes a collection of numbers that form an empirical distribution. We demonstrate that the genetic difference between samples can be quantified by the distance between empirical distributions. Nubeam includes the k-mer method as a special case, but unlike the k-mer method, Nubeam can conveniently account for GC bias and nucleotide quality. As a reference-free approach, Nubeam avoids reference bias and mapping bias, and can work with organisms without reference genomes. Thus, Nubeam is well suited to analyzing data sets from metagenomic whole genome shotgun (WGS) sequencing, where the number of unmapped reads is substantial. When applied to a WGS sequencing data set to quantify distances between metagenomic samples from various human body habitats, Nubeam recapitulates findings made by mapping-based methods and sheds light on the contributions of unmapped reads. Nubeam is also useful in analyzing 16S rRNA sequencing data, which is a more prevalent type of data set in metagenomics studies. In our analysis, Nubeam recapitulated the finding that natural microbiota in the mouse gut are resilient under challenges, and Nubeam detected differences in vaginal microbiota between cases of polycystic ovary syndrome and healthy controls.
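The core idea in the abstract (nucleotide to matrix, read to matrix product, sample to empirical distribution) can be illustrated with a minimal Python sketch. The abstract does not say which matrices Nubeam uses or how a number is derived from the product, so everything concrete here is an assumption made for illustration: the 2x2 integer matrices `L` and `R`, the mapping in `NUCLEOTIDE_MATRIX`, and the helper names `read_to_matrix` and `sample_to_distribution` are all hypothetical, and the full product matrix stands in for the "numbers" assigned to a read.

```python
from collections import Counter

# Assumed building blocks (NOT taken from the paper): two 2x2 integer
# matrices that do not commute with each other.
L = ((1, 0), (1, 1))
R = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


# Hypothetical nucleotide-to-matrix mapping: each base gets a distinct
# two-factor product of L and R.
NUCLEOTIDE_MATRIX = {
    "A": matmul(L, L),
    "C": matmul(L, R),
    "G": matmul(R, L),
    "T": matmul(R, R),
}


def read_to_matrix(read):
    """Turn a read into the ordered product of its nucleotide matrices."""
    product = IDENTITY
    for base in read:
        product = matmul(product, NUCLEOTIDE_MATRIX[base])
    return product


def sample_to_distribution(reads):
    """Summarize a sample as an empirical distribution over product matrices."""
    counts = Counter(read_to_matrix(read) for read in reads)
    total = sum(counts.values())
    return {matrix: n / total for matrix, n in counts.items()}


# Because matrix multiplication is noncommutative, the order of bases matters.
print(read_to_matrix("AC"))  # ((1, 1), (3, 4))
print(read_to_matrix("CA"))  # ((3, 1), (5, 2))
```

In this toy encoding, reads "AC" and "CA" contain the same bases but map to different matrices, which is the property the abstract attributes to noncommutativity. Comparing two samples would then require a distance between the two empirical distributions; the abstract does not specify which distance Nubeam uses, so none is assumed here.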

Updated: 2020-09-15