Phylogenetic placement of short reads without sequence alignment
bioRxiv - Bioinformatics, Pub Date: 2020-10-19, DOI: 10.1101/2020.10.19.344986
Matthias Blanke, Burkhard Morgenstern

Phylogenetic placement is the task of placing a query sequence of unknown taxonomic origin into a given phylogenetic tree of a set of reference sequences. Several approaches to phylogenetic placement have been proposed in recent years. The most accurate of them need a multiple alignment of the reference sequences as input. Most of them also need alignments of the query sequences to the multiple alignment of the reference sequences. A major field of application of phylogenetic placement is taxonomic read assignment in metagenomics. Herein, we propose App-SpaM, an efficient alignment-free algorithm for phylogenetic placement of short sequencing reads on a tree of a set of reference genomes. App-SpaM is based on the Filtered Spaced Word Matches approach that we previously developed. Unlike other methods, our approach neither requires a multiple alignment of the reference genomes, nor alignments of the queries to the reference sequences. Moreover, App-SpaM works not only on assembled reference genomes, but can also take reference taxa as input for which only unassembled read sequences are available. The quality of the results achieved with App-SpaM is comparable to the best available approaches to phylogenetic placement. However, since App-SpaM is not based on sequence alignment, it is between one and two orders of magnitude faster than those existing methods.
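
The abstract names the Filtered Spaced Word Matches approach but does not describe its internals. As a rough illustration of the general idea only, the sketch below finds positions where a read and a reference agree at the "match" positions of a binary pattern, scores the remaining "don't-care" positions, discards low-scoring matches, and estimates a distance from the mismatch rate among the matches that are kept. Everything specific here is an assumption made for illustration, not the App-SpaM implementation: the pattern `1100101`, the function names `spaced_words` and `estimate_distance`, the +1/-1 scoring, the `min_score` threshold, and the Jukes-Cantor correction. The sketch also stops at the distance estimate and does not place the read on a tree.

```python
import math
from collections import defaultdict

# Hypothetical pattern: '1' = match position, '0' = don't-care position.
PATTERN = "1100101"


def spaced_words(seq, pattern=PATTERN):
    """Map each spaced word (characters at the '1' positions) to its start offsets."""
    match_pos = [i for i, c in enumerate(pattern) if c == "1"]
    index = defaultdict(list)
    for start in range(len(seq) - len(pattern) + 1):
        word = "".join(seq[start + i] for i in match_pos)
        index[word].append(start)
    return index


def estimate_distance(read, reference, pattern=PATTERN, min_score=0):
    """Estimate substitutions per site between a read and a reference.

    Returns None when no spaced-word match survives the filter.
    """
    dont_care = [i for i, c in enumerate(pattern) if c == "0"]
    ref_index = spaced_words(reference, pattern)
    mismatches = 0
    compared = 0
    for word, read_starts in spaced_words(read, pattern).items():
        for r in read_starts:
            for g in ref_index.get(word, ()):
                pairs = [(read[r + i], reference[g + i]) for i in dont_care]
                # Assumed scoring: +1 per identical pair, -1 per mismatch.
                score = sum(1 if a == b else -1 for a, b in pairs)
                if score < min_score:
                    continue  # filter out likely random matches
                mismatches += sum(a != b for a, b in pairs)
                compared += len(pairs)
    if compared == 0:
        return None
    p = mismatches / compared
    if p >= 0.75:
        return math.inf  # the Jukes-Cantor correction is undefined here
    # Jukes-Cantor correction of the observed mismatch fraction.
    return -0.75 * math.log(1 - 4 * p / 3)
```

In a placement setting, a distance of this kind would be computed between each read and each reference, and the read would then be attached near the references with the smallest distances. How App-SpaM selects the final position in the tree is not stated in the abstract.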

Updated: 2020-10-20