A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila.
Journal of Molecular Evolution ( IF 3.9 ) Pub Date : 2020-04-07 , DOI: 10.1007/s00239-020-09939-z
Brennen Heames 1 , Jonathan Schmitz 1 , Erich Bornberg-Bauer 1

Orphan genes, lacking detectable homologs in outgroup species, typically represent 10-30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7-39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes.
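As a quick arithmetic check on the figures above, the reported 8.7-39.2% range appears to be the 550 and 2467 de novo candidates taken as a share of the 6297 orphan genes (not of the 4953 clusters). The sketch below only re-derives those percentages from the counts in the abstract; the variable and function names are illustrative and do not come from the paper.

```python
# Counts as reported in the abstract.
ORPHAN_GENES = 6297   # orphan genes across the twelve annotated proteomes
DE_NOVO_LOW = 550     # lower bound on genes with inferred non-coding ancestry
DE_NOVO_HIGH = 2467   # upper bound on genes with inferred non-coding ancestry


def percent_of_orphans(count, total=ORPHAN_GENES):
    """Return `count` as a percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical helper names; the denominator is assumed to be the orphan-gene count.
low = percent_of_orphans(DE_NOVO_LOW)
high = percent_of_orphans(DE_NOVO_HIGH)
print(f"{low}% to {high}% of orphan genes")
```

Using the cluster count (4953) as the denominator would not reproduce the reported range, which is why the orphan-gene count is assumed here.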

Updated: 2020-04-07