Whole genome phylogeny of Gallus: introgression and data-type effects
Avian Research (IF 1.8), Pub Date: 2020-03-17, DOI: 10.1186/s40657-020-00194-w
George P. Tiley, Akanksha Pandey, Rebecca T. Kimball, Edward L. Braun, J. Gordon Burleigh

Previous phylogenetic studies that include the four recognized species of Gallus have recovered several distinct topologies, with little agreement among them. Several factors could lead to the failure to converge on a consistent topology, including introgression, incomplete lineage sorting, different data types, or insufficient data. We generated three novel whole genome assemblies for Gallus species, which we combined with data from the published genomes of Gallus gallus and Bambusicola thoracicus (a member of the sister genus to Gallus). To determine why previous studies have failed to converge on a single topology, we extracted large numbers of orthologous exons, introns, ultra-conserved elements, and conserved non-exonic elements from the genome assemblies. This provided more than 32 million base pairs of data that we used for concatenated maximum likelihood and multispecies coalescent analyses of Gallus. All of our analyses, regardless of data type, yielded a single, well-supported topology. We found some evidence for ancient introgression involving specific Gallus lineages, as well as modest data type effects that influenced support and branch length estimates in specific analyses. However, the estimated gene tree spectra for all data types fit their expectations under the multispecies coalescent relatively well. Overall, our data suggest that conflicts among previous studies probably reflect the use of smaller datasets (in terms of both number of sites and number of loci) in those analyses. Our results demonstrate the importance of sampling large numbers of loci, each of which has enough sites to provide robust estimates of gene trees. Low-coverage whole genome sequencing, as used here, is a cost-effective way to generate very large datasets spanning multiple data types, which enabled us to obtain a robust estimate of Gallus phylogeny.
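As an illustration of what a gene tree spectrum "fitting its expectation under the multispecies coalescent" can mean, the following is a minimal sketch and not the authors' method. It uses the standard rooted three-taxon result: with an internal branch of length t (in coalescent units), the gene tree matching the species tree has probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function names and the idea of summarizing fit with a Pearson chi-square statistic are assumptions made for this example.

```python
import math


def triplet_gene_tree_probs(t):
    """Expected gene tree topology probabilities for a rooted triple.

    t is the internal branch length in coalescent units (assumed known
    here; in practice it would be estimated). Returns a tuple
    (concordant, discordant_1, discordant_2).
    """
    discordant = math.exp(-t) / 3.0
    concordant = 1.0 - 2.0 * discordant
    return concordant, discordant, discordant


def chi_square_stat(counts, t):
    """Pearson chi-square statistic comparing observed topology counts
    (concordant, discordant_1, discordant_2) with the expectation above.
    Hypothetical helper for illustration only.
    """
    probs = triplet_gene_tree_probs(t)
    total = sum(counts)
    return sum((c - total * p) ** 2 / (total * p) for c, p in zip(counts, probs))
```

Under this model the two discordant topologies are expected at equal frequency, so a marked asymmetry between them in the observed counts is one signal that is often read as consistent with introgression rather than incomplete lineage sorting alone.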

Updated: 2020-03-17