Analysis of Paralogs in Target Enrichment Data Pinpoints Multiple Ancient Polyploidy Events in Alchemilla s.l. (Rosaceae)
Systematic Biology (IF 6.5), Pub Date: 2021-05-05, DOI: 10.1093/sysbio/syab032
Diego F Morales-Briones 1, 2, Berit Gehrke 3, Chien-Hsun Huang 4, Aaron Liston 5, Hong Ma 6, Hannah E Marx 7, 8, David C Tank 2, Ya Yang 1

Target enrichment is becoming increasingly popular for phylogenomic studies. Although baits for enrichment are typically designed to target single-copy genes, paralogs are often recovered with increased sequencing depth, sometimes from a significant proportion of loci, especially in groups experiencing whole-genome duplication (WGD) events. Common approaches for processing paralogs in target enrichment data sets include random selection, manual pruning, and mainly, the removal of entire genes that show any evidence of paralogy. These approaches are prone to errors in orthology inference or removing large numbers of genes. By removing entire genes, valuable information that could be used to detect and place WGD events is discarded. Here, we used an automated approach for orthology inference in a target enrichment data set of 68 species of Alchemilla s.l. (Rosaceae), a widely distributed clade of plants primarily from temperate climate regions. Previous molecular phylogenetic studies and chromosome numbers both suggested ancient WGDs in the group. However, both the phylogenetic location and putative parental lineages of these WGD events remain unknown. By taking paralogs into consideration and inferring orthologs from target enrichment data, we identified four nodes in the backbone of Alchemilla s.l. with an elevated proportion of gene duplication. Furthermore, using a gene-tree reconciliation approach, we established the autopolyploid origin of the entire Alchemilla s.l. and the nested allopolyploid origin of four major clades within the group. Here, we showed the utility of automated tree-based orthology inference methods, previously designed for genomic or transcriptomic data sets, to study complex scenarios of polyploidy and reticulate evolution from target enrichment data sets. [Alchemilla; allopolyploidy; autopolyploidy; gene tree discordance; orthology inference; paralogs; Rosaceae; target enrichment; whole genome duplication.]

Updated: 2021-05-05