Benchmarking orthology methods using phylogenetic patterns defined at the base of Eukaryotes.
Briefings in Bioinformatics (IF 9.5), Pub Date: 2021-05-20, DOI: 10.1093/bib/bbaa206
Eva S Deutekom, Berend Snel, Teunis J P van Dam

Insights into the evolution of ancestral complexes and pathways are generally achieved through careful and time-intensive manual analysis, often using phylogenetic profiles of the constituent proteins. This manual analysis limits the possibility of including more protein-complex components, repeating the analyses for updated genome sets or expanding the analyses to larger scales. Automated orthology inference should allow such large-scale analyses, but substantial differences are observed between the orthologous groups generated by different approaches. We evaluate orthology methods for their ability to recapitulate a number of observations that have been made with regard to genome evolution in eukaryotes. Specifically, we investigate phylogenetic profile similarity (co-occurrence of complexes), the last eukaryotic common ancestor's gene content, pervasiveness of gene loss and the overlap with manually determined orthologous groups. Moreover, we compare the inferred orthologies to each other. We find that most orthology methods reconstruct a large last eukaryotic common ancestor, with substantial gene loss, and can predict interacting proteins reasonably well when applying phylogenetic co-occurrence. At the same time, derived orthologous groups show imperfect overlap with manually curated orthologous groups. There is no strong indication that any one orthology method performs better than another on individual aspects or on all of them. Counterintuitively, although the orthology methods behave similarly in large-scale evaluation, the obtained orthologous groups differ vastly from one another.

Availability and implementation: The data and code underlying this article are available on GitHub and/or upon reasonable request to the corresponding author: https://github.com/ESDeutekom/ComparingOrthologies.
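As an illustration of the phylogenetic profile similarity (co-occurrence) that the abstract refers to, the minimal sketch below scores pairs of orthologous groups by how similar their species presence/absence patterns are. The abstract does not state which similarity measure the authors use, so the Jaccard index here is an assumption, and the group names, species sets and the helpers `jaccard` and `rank_pairs` are hypothetical.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence profiles (sets of species)."""
    union = profile_a | profile_b
    if not union:
        # Two empty profiles share no evidence of co-occurrence.
        return 0.0
    return len(profile_a & profile_b) / len(union)


def rank_pairs(profiles):
    """Score every pair of orthologous groups, most similar first."""
    scored = [
        (jaccard(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical orthologous groups mapped to the species in which they occur.
profiles = {
    "OG_A": {"human", "yeast", "arabidopsis", "dictyostelium"},
    "OG_B": {"human", "yeast", "arabidopsis"},
    "OG_C": {"plasmodium", "tetrahymena"},
}

pairs = rank_pairs(profiles)
for score, og_a, og_b in pairs:
    print(f"{og_a} vs {og_b}: {score:.2f}")
```

In this toy data, groups that are present in, and lost from, largely the same species score highest. That is the intuition behind using co-occurrence to predict interacting proteins, and the result depends on which orthology method defined the groups in the first place.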

Updated: 2020-09-16