GenOrigin: A Comprehensive Protein-coding Gene Origination Database on the Evolutionary Timescale of Life
bioRxiv - Evolutionary Biology. Pub Date: 2020-10-17, DOI: 10.1101/2020.10.17.342022
Yi-Bo Tong, Meng-Wei Shi, Sheng Hu Qian, Yu-Jie Chen, Zhi-Hui Luo, Yi-Xuan Tu, Chunyan Chen, Zhen-Xia Chen

The origination of new genes contributes to the biological diversity of life. New genes may quickly build their own networks in the genome, exert important functions, and generate novel phenotypes. Dating gene ages and inferring the origination mechanisms of new genes, such as primate-specific genes, is the basis for functional studies of those genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically dated 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with the Wagner parsimony algorithm. We also collected gene age estimates from other studies and standardized them to time ranges in millions of years so that they can be compared across studies. All the data are cataloged in GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. GenOrigin provides gene age estimates, annotation, gene ontology, orthologs and paralogs, together with detailed gene presence/absence views on a species tree with an evolutionary timescale that underlie the gene age inference, so that researchers can explore gene functions.
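
The abstract names Wagner parsimony on gene-family presence/absence as the dating method but does not give its details. The following is a minimal sketch of the general idea only, not the GenOrigin pipeline: a two-state Sankoff-style dynamic program that assigns presence or absence to ancestral nodes and reports the oldest node at which the family is gained. The `Node` class, the `infer_origin` and `toy_tree` helpers, the gain and loss penalties, the tie-breaking rules, and the node ages are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

INF = float("inf")

@dataclass
class Node:
    name: str                      # must be unique within the tree
    age_mya: float = 0.0           # toy node age in million years (made up)
    children: list = field(default_factory=list)
    present: bool = False          # observed presence, used only at leaves

def infer_origin(root, gain=2.0, loss=1.0):
    """Return (node name, age) of the oldest inferred gain, or None.

    gain/loss are illustrative penalties, not values from the paper.
    """
    costs = {}  # name -> (min cost if absent here, min cost if present here)

    def up(node):
        # Bottom-up pass: cheapest subtree cost for each state of this node.
        if not node.children:
            costs[node.name] = (INF, 0.0) if node.present else (0.0, INF)
            return
        absent = present = 0.0
        for child in node.children:
            up(child)
            c_abs, c_pres = costs[child.name]
            absent += min(c_abs, c_pres + gain)    # child gains the family
            present += min(c_abs + loss, c_pres)   # child loses the family
        costs[node.name] = (absent, present)

    gains = []

    def down(node, parent_state):
        # Top-down pass: pick the cheaper state given the parent's state.
        c_abs, c_pres = costs[node.name]
        if parent_state == 0:
            state = 0 if c_abs <= c_pres + gain else 1  # tie: stay absent
        else:
            state = 0 if c_abs + loss < c_pres else 1   # tie: stay present
        if state == 1 and parent_state == 0:
            gains.append(node)
        for child in node.children:
            down(child, state)

    up(root)
    down(root, 0)  # the lineage above the root is treated as lacking the family
    if not gains:
        return None
    oldest = max(gains, key=lambda n: n.age_mya)
    return oldest.name, oldest.age_mya

def toy_tree(present_leaves):
    """Build the toy tree ((A,B)AB,(C,D)CD)root with made-up ages."""
    def leaf(name):
        return Node(name, present=name in present_leaves)
    ab = Node("AB", 40.0, [leaf("A"), leaf("B")])
    cd = Node("CD", 60.0, [leaf("C"), leaf("D")])
    return Node("root", 100.0, [ab, cd])

if __name__ == "__main__":
    print(infer_origin(toy_tree({"A", "B"})))
    print(infer_origin(toy_tree({"A", "B", "C"})))
```

In this toy setup, a family seen only in A and B is dated to their common ancestor AB, while a family seen in A, B, and C is dated to the root with one loss on the lineage to D, because a single gain plus a loss is cheaper than two independent gains under these penalties. Raising or lowering `gain` relative to `loss` shifts that balance.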

Updated: 2020-10-17