The Frequency and Topology of Pseudoorthologs
Systematic Biology (IF 6.1), Pub Date: 2021-12-20, DOI: 10.1093/sysbio/syab097
Megan L Smith, Matthew W Hahn

Phylogenetics has long relied on the use of orthologs, or genes related through speciation events, to infer species relationships. However, identifying orthologs is difficult because gene duplication can obscure relationships among genes. Researchers have been particularly concerned with the insidious effects of pseudoorthologs: duplicated genes that are mistaken for orthologs because they are present in a single copy in each sampled species. Because gene tree topologies of pseudoorthologs may differ from the species tree topology, they have often been invoked as the cause of counterintuitive results in phylogenetics. Despite these perceived problems, no previous work has calculated the probabilities of pseudoortholog topologies or has been able to circumscribe the regions of parameter space in which pseudoorthologs are most likely to occur. Here, we introduce a model for calculating the probabilities and branch lengths of orthologs and pseudoorthologs, including concordant and discordant pseudoortholog topologies, on a rooted three-taxon species tree. We show that the probability of orthologs is high relative to the probability of pseudoorthologs across reasonable regions of parameter space. Furthermore, the probabilities of the two discordant topologies are equal and never exceed that of the concordant topology, generally being much lower. We describe the species tree topologies most prone to generating pseudoorthologs, finding that they are likely to present problems to phylogenetic inference irrespective of the presence of pseudoorthologs. Overall, our results suggest that pseudoorthologs are unlikely to mislead inferences of species relationships under the biological scenarios considered here. [Birth–death model; orthologs; paralogs; phylogenetics.]
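
To make the concordant and discordant pseudoortholog topologies concrete, the sketch below enumerates a deliberately simplified toy scenario. It is not the model introduced in the paper. It assumes a rooted species tree ((A,B),C), exactly one duplication on the stem before the root, no further duplications, and independent loss of each copy at a constant rate along every branch. The function name `topology_probabilities` and the parameters `loss_rate`, `t_internal`, and `t_tip` are hypothetical. Because the sketch conditions on a duplication having occurred, it says nothing about how common orthologs are relative to pseudoorthologs overall; it only illustrates why the two discordant topologies can be expected to be equally likely and no more likely than the concordant one.

```python
import math
from itertools import product


def topology_probabilities(loss_rate, t_internal, t_tip):
    """Toy model (an assumption, not the paper's model).

    Species tree ((A,B),C). One duplication on the stem creates two copies.
    Each copy is then lost independently at rate `loss_rate` per unit time.
    Returns topology probabilities conditional on every species retaining
    exactly one copy.
    """
    if loss_rate <= 0:
        raise ValueError("loss_rate must be positive")

    s_int = math.exp(-loss_rate * t_internal)          # survives the (A,B) ancestral branch
    s_tip = math.exp(-loss_rate * t_tip)               # survives a tip branch to A or B
    s_c = math.exp(-loss_rate * (t_internal + t_tip))  # survives the long branch to C

    def pattern_prob(a, b, c):
        """Probability that one copy is present (1) or absent (0) in A, B, C."""
        p_c = s_c if c else 1.0 - s_c
        # If the copy survives the internal branch, A and B lose it independently.
        p_ab = s_int * (s_tip if a else 1.0 - s_tip) * (s_tip if b else 1.0 - s_tip)
        if not a and not b:
            p_ab += 1.0 - s_int  # the copy was lost in the ancestor of A and B
        return p_ab * p_c

    probs = {
        "ortholog": 0.0,
        "concordant ((A,B),C)": 0.0,
        "discordant ((A,C),B)": 0.0,
        "discordant ((B,C),A)": 0.0,
    }
    for a, b, c in product((0, 1), repeat=3):
        # Copy 1 has pattern (a, b, c); single-copy sampling forces
        # copy 2 to have the complementary pattern.
        joint = pattern_prob(a, b, c) * pattern_prob(1 - a, 1 - b, 1 - c)
        if a == b == c:
            key = "ortholog"               # all species kept the same copy
        elif a == b:
            key = "concordant ((A,B),C)"   # A and B share a copy, C has the other
        elif a == c:
            key = "discordant ((A,C),B)"   # A and C share a copy
        else:
            key = "discordant ((B,C),A)"   # B and C share a copy
        probs[key] += joint

    total = sum(probs.values())
    return {key: value / total for key, value in probs.items()}


if __name__ == "__main__":
    for name, p in topology_probabilities(1.0, 0.5, 1.0).items():
        print(f"{name}: {p:.4f}")
```

In this toy setting, a discordant topology requires the copy to survive the branch ancestral to A and B and then be lost separately in one of them, whereas the concordant topology can also arise from a single loss in that ancestor, which is why the discordant cases cannot be more probable.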

Updated: 2021-12-20