phyloFlash: Rapid Small-Subunit rRNA Profiling and Targeted Assembly from Metagenomes
mSystems (IF 6.4), Pub Date: 2020-10-27, DOI: 10.1128/msystems.00920-20
Harald R. Gruber-Vodicka 1 , Brandon K. B. Seah 1 , Elmar Pruesse 2

The small-subunit rRNA (SSU rRNA) gene is the key marker in molecular ecology for all domains of life, but it is largely absent from metagenome-assembled genomes that often are the only resource available for environmental microbes. Here, we present phyloFlash, a pipeline to overcome this gap with rapid, SSU rRNA-centered taxonomic classification, targeted assembly, and graph-based binning of full metagenomic assemblies. We show that a cleanup of artifacts is pivotal even with a curated reference database. With such a filtered database, the general-purpose mapper BBmap extracts SSU rRNA reads five times faster than the rRNA-specialized tool SortMeRNA with similar sensitivity and higher selectivity on simulated metagenomes. Reference-based targeted assemblers yielded either highly fragmented assemblies or high levels of chimerism, so we employ the general-purpose genomic assembler SPAdes. Our optimized implementation is independent of reference database composition and has satisfactory levels of chimera formation. phyloFlash quickly processes Illumina (meta)genomic data, is straightforward to use, even as part of high-throughput quality control, and has user-friendly output reports. The software is available at https://github.com/HRGV/phyloFlash (GPL3 license) and is documented with an online manual.

Updated: 2020-10-28