SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling
Genomics, Proteomics & Bioinformatics (IF 9.5), Pub Date: 2021-09-15, DOI: 10.1016/j.gpb.2021.09.002
Yanyan Li 1, Honghong Zhou 2, Xiaomin Chen 3, Yu Zheng 1, Quan Kang 2, Di Hao 2, Lili Zhang 3, Tingrui Song 2, Huaxia Luo 2, Yajing Hao 4, Runsheng Chen 5, Peng Zhang 2, Shunmin He 1

Small proteins are proteins of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotations. The significance of small proteins has become clear in recent years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was developed specifically to provide valuable information on small proteins to the scientific community. Here we present an update of SmProt that emphasizes the reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and a remarkably increased data volume. Additional components, such as non-ATG translation initiation, function, and new sources, are also included. SmProt incorporates 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from the literature and other sources, covering 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were also collected. All datasets in SmProt are free to access and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.




Updated: 2021-09-15