Prot-SpaM: fast alignment-free phylogeny reconstruction based on whole-proteome sequences.
GigaScience (IF 11.8), Pub Date: 2018-12-12, DOI: 10.1093/gigascience/giy148
Chris-Andre Leimeister 1 , Jendrik Schellhorn 1 , Svenja Dörrer 1 , Michael Gerth 2 , Christoph Bleidorn 3, 4 , Burkhard Morgenstern 1, 5

Word-based or 'alignment-free' sequence comparison has become an active research area in bioinformatics. While previous word-frequency approaches calculated rough measures of sequence similarity or dissimilarity, some new alignment-free methods are able to accurately estimate phylogenetic distances between genomic sequences. One of these approaches is Filtered Spaced Word Matches. Here, we extend this approach to estimate evolutionary distances between complete or incomplete proteomes; our implementation of this approach is called Prot-SpaM. We compare the performance of Prot-SpaM to other alignment-free methods on simulated sequences and on various groups of eukaryotic and prokaryotic taxa. Prot-SpaM can be used to calculate high-quality phylogenetic trees for dozens of whole-proteome sequences in a matter of seconds or minutes and often outperforms other alignment-free approaches. The source code of our software is available through GitHub: https://github.com/jschellh/ProtSpaM.
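
As a rough illustration of the spaced-word idea named in the abstract, the Python sketch below finds windows in two protein sequences that agree at the match positions (`1`) of a binary pattern, then counts mismatches at the don't-care positions (`0`). It is not the Prot-SpaM implementation. The function names, the toy pattern `1101`, and the `min_identity` filter are assumptions made for illustration. The abstract does not specify how matches are scored or how distances are corrected, so a plain identity threshold and Kimura's approximation for protein distances stand in here.

```python
from collections import defaultdict
from math import log


def spaced_words(seq, pattern):
    """Yield (word, start) per window; the word keeps only match ('1') positions."""
    match_pos = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield "".join(seq[start + i] for i in match_pos), start


def mismatch_fraction(seq_a, seq_b, pattern, min_identity=0.5):
    """Fraction of mismatching don't-care positions over all kept matches.

    min_identity is a hypothetical filter: matches whose don't-care
    positions agree less often than this are treated as background noise.
    Returns None when no spaced-word match survives the filter.
    """
    dont_care = [i for i, c in enumerate(pattern) if c == "0"]
    if not dont_care:
        raise ValueError("pattern needs at least one don't-care ('0') position")

    # Index every spaced word of seq_a by its start positions.
    index = defaultdict(list)
    for word, start in spaced_words(seq_a, pattern):
        index[word].append(start)

    mismatches = total = 0
    for word, start_b in spaced_words(seq_b, pattern):
        for start_a in index.get(word, []):
            same = sum(seq_a[start_a + i] == seq_b[start_b + i] for i in dont_care)
            if same / len(dont_care) < min_identity:
                continue  # filtered out as a likely random match
            mismatches += len(dont_care) - same
            total += len(dont_care)
    return mismatches / total if total else None


def kimura_distance(p):
    """Kimura's approximation d = -ln(1 - p - 0.2 * p^2) (assumed correction)."""
    return -log(1.0 - p - 0.2 * p * p)
```

With the toy sequences `MKTAYIAKQR` and `MKTCYIAKQR` and `min_identity=0.0`, four spaced-word matches are kept and one of the four don't-care positions differs, so `mismatch_fraction` returns 0.25. A practical pattern would be much longer than `1101`, so that the filter has enough don't-care positions to separate homologous matches from chance ones.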

Updated: 2018-12-07