PhyloFisher: A phylogenomic package for resolving eukaryotic relationships.
PLOS Biology (IF 7.8) Pub Date: 2021-08-06, DOI: 10.1371/journal.pbio.3001365
Alexander K Tice 1, 2 , David Žihala 3 , Tomáš Pánek 1, 3 , Robert E Jones 1, 2 , Eric D Salomaki 4 , Serafim Nenarokov 4 , Fabien Burki 5, 6 , Marek Eliáš 3 , Laura Eme 7 , Andrew J Roger 8 , Antonis Rokas 9 , Xing-Xing Shen 10 , Jürgen F H Strassert 5, 11 , Martin Kolísko 4, 12 , Matthew W Brown 1, 2

Phylogenomic analyses of hundreds of protein-coding genes aimed at resolving phylogenetic relationships are now common practice. However, no software currently exists that includes tools for dataset construction and subsequent analysis with diverse validation strategies to assess robustness. Furthermore, there are no publicly available high-quality curated databases designed to assess deep (>100 million years) relationships in the tree of eukaryotes. To address these issues, we developed an easy-to-use software package, PhyloFisher (https://github.com/TheBrownLab/PhyloFisher), written in Python 3. PhyloFisher includes a manually curated database of 240 protein-coding genes from 304 eukaryotic taxa covering known eukaryotic diversity, a novel tool for ortholog selection, and utilities that perform the diverse analyses required by state-of-the-art phylogenomic investigations. Through phylogenetic reconstructions of the tree of eukaryotes and of the Saccharomycetaceae clade of budding yeasts, we demonstrate the utility of the PhyloFisher workflow and the provided starting database to address phylogenetic questions across a large range of evolutionary time points for diverse groups of organisms. We also demonstrate that undetected paralogy can remain in phylogenomic "single-copy orthogroup" datasets constructed using widely accepted methods such as all vs. all BLAST searches followed by Markov Cluster Algorithm (MCL) clustering and application of automated tree pruning algorithms. Finally, we show how the PhyloFisher workflow helps detect inadvertent paralog inclusions, allowing the user to make more informed decisions regarding orthology assignments, leading to a more accurate final dataset.

Updated: 2021-08-06