MetaBinG2: a fast and accurate metagenomic sequence classification system for samples with many unknown organisms.
Biology Direct (IF 5.7), Pub Date: 2018-08-22, DOI: 10.1186/s13062-018-0220-y
Yuyang Qiao 1 , Ben Jia 1, 2 , Zhiqiang Hu 1 , Chen Sun 1, 2 , Yijin Xiang 3 , Chaochun Wei 1, 2, 4

BACKGROUND Many methods have been developed for metagenomic sequence classification, and most of them depend heavily on the genome sequences of known organisms. As a result, a large portion of the sequenced reads may be classified as unknown, which greatly impairs our understanding of the whole sample.

RESULTS Here we present MetaBinG2, a fast method for metagenomic sequence classification, designed especially for samples with a large number of unknown organisms. MetaBinG2 is based on sequence composition and uses GPUs for acceleration. A million 100 bp Illumina sequences can be classified in about 1 min on a computer with one GPU card. We evaluated MetaBinG2 by comparing it to multiple popular existing methods. We then applied MetaBinG2 to the MetaSUB Inter-City Challenge dataset provided by the CAMDA data analysis contest and compared the community composition structures of environmental samples from different public places across cities.

CONCLUSION Compared to existing methods, MetaBinG2 is fast and accurate, especially for samples with significant proportions of unknown organisms.

REVIEWERS This article was reviewed by Drs. Eran Elhaik, Nicolas Rascovan, and Serghei Mangul.
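The abstract describes classification "based on sequence composition" without giving details. As a rough illustration of what a composition-based classifier can look like, the sketch below scores a read against smoothed k-mer frequency profiles built from reference sequences and picks the best-scoring one. It is a minimal CPU-only sketch under stated assumptions, not MetaBinG2's actual algorithm or its GPU implementation: the function names, the choice of k, the pseudocount smoothing, and the toy reference sequences are all hypothetical.

```python
from collections import Counter
from math import log


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, skipping any with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def log_profile(reference, k, pseudocount=1.0):
    """Build smoothed log k-mer frequencies for one reference sequence.

    Returns (table, default): table maps observed k-mers to log frequency,
    default is the log frequency assigned to k-mers never seen in the reference.
    """
    counts = kmer_counts(reference, k)
    total = sum(counts.values()) + pseudocount * 4 ** k
    table = {kmer: log((c + pseudocount) / total) for kmer, c in counts.items()}
    return table, log(pseudocount / total)


def score(read, profile, k):
    """Sum the log frequencies of the read's k-mers under one profile."""
    table, default = profile
    return sum(n * table.get(kmer, default)
               for kmer, n in kmer_counts(read, k).items())


def classify(read, profiles, k):
    """Return the name of the profile under which the read scores highest."""
    return max(profiles, key=lambda name: score(read, profiles[name], k))


# Hypothetical toy references; real profiles would come from reference genomes.
K = 3
profiles = {
    "AT_rich": log_profile("ATATATATTTAAATATATAT" * 5, K),
    "GC_rich": log_profile("GCGCGGCCGCGCCCGGGCGC" * 5, K),
}
print(classify("ATATTTAATA", profiles, K))
```

Because each read is scored independently of every other read, this kind of computation parallelizes naturally, which is consistent with the abstract's statement that GPUs are used for acceleration.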

Updated: 2020-04-22