Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA
BMC Genomics (IF 3.5), Pub Date: 2020-03-30, DOI: 10.1186/s12864-020-6662-5
Hiro Takahashi, Noriya Hayashi, Yuta Hiragori, Shun Sasaki, Taichiro Motomura, Yui Yamashita, Satoshi Naito, Anna Takahashi, Kazuyuki Fuse, Kenji Satou, Toshinori Endo, Shoko Kojima, Hitoshi Onouchi

Upstream open reading frames (uORFs) in the 5′-untranslated regions (5′-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). Genome-wide searches for uORFs with conserved peptide sequences (CPuORFs) have relied on comparative genomic studies in which uORF sequences were compared between selected species. To increase the chances of identifying CPuORFs, we previously developed an approach in which uORF sequences were compared, using BLAST, between Arabidopsis and any other plant species with available transcript sequence databases. Applying this approach to multiple plant species belonging to phylogenetically distant clades is expected to identify CPuORFs conserved in various plant lineages more comprehensively, including those conserved only among relatively small taxonomic groups. To efficiently compare uORF sequences among many species and identify CPuORFs conserved in various taxonomic lineages, we developed a novel pipeline, ESUCA. We applied ESUCA to the genomes of five angiosperm species belonging to phylogenetically distant clades and selected CPuORFs conserved among at least three different orders. Through these analyses, we identified 89 novel CPuORF families. As expected, ESUCA analysis of each of the five angiosperm genomes identified many CPuORFs that were not identified by ESUCA analyses of the other four species. Unexpectedly, however, these CPuORFs include some that are conserved across wide taxonomic ranges, indicating that the approach used here is useful for comprehensive identification of both narrowly and widely conserved CPuORFs. Examination of the effects of 11 selected CPuORFs on mORF translation revealed that CPuORFs conserved only in relatively narrow taxonomic ranges can have sequence-dependent regulatory effects, suggesting that most of the identified CPuORFs are conserved because of functional constraints on their encoded peptides. This study demonstrates that ESUCA can efficiently identify CPuORFs that are likely to be conserved because of the functional importance of their encoded peptides. Furthermore, our data show that comparing uORF sequences from multiple species with those of many other species, using ESUCA, is highly effective in comprehensively identifying CPuORFs conserved in various taxonomic ranges.
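The two ideas at the core of the abstract, extracting uORFs from a 5′-UTR and keeping only those with homologs in at least three different orders, can be illustrated with a minimal Python sketch. This is not the ESUCA implementation: the function names (`find_uorfs`, `conserved_in_min_orders`), the simplification that a uORF must both start and stop inside the 5′-UTR, and the toy hit list are all assumptions made for illustration. In a real analysis the hits would come from a homology search such as BLAST.

```python
from collections import defaultdict

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, min_codons=2):
    """Return (start, end, sequence) for every ATG-initiated ORF that
    terminates inside the given 5'-UTR (a simplifying assumption)."""
    seq = utr5.upper().replace("U", "T")
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # sense codons, stop excluded
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


def conserved_in_min_orders(hits, order_of, min_orders=3):
    """Keep uORF ids whose homolog hits span at least `min_orders` orders.

    hits:     iterable of (uorf_id, species) pairs (hypothetical input)
    order_of: dict mapping species name to taxonomic order
    """
    orders = defaultdict(set)
    for uorf_id, species in hits:
        if species in order_of:
            orders[uorf_id].add(order_of[species])
    return sorted(u for u, seen in orders.items() if len(seen) >= min_orders)


if __name__ == "__main__":
    # Toy 5'-UTR: one complete uORF (ATG AAA TAG) and one ATG with no stop.
    print(find_uorfs("GGATGAAATAGCCATGCCCTTT"))

    # Toy taxonomy and made-up hits, for illustration only.
    order_of = {
        "Arabidopsis thaliana": "Brassicales",
        "Solanum lycopersicum": "Solanales",
        "Oryza sativa": "Poales",
        "Zea mays": "Poales",
    }
    hits = [
        ("u1", "Arabidopsis thaliana"),
        ("u1", "Solanum lycopersicum"),
        ("u1", "Oryza sativa"),
        ("u2", "Oryza sativa"),
        ("u2", "Zea mays"),
        ("u2", "Arabidopsis thaliana"),
    ]
    print(conserved_in_min_orders(hits, order_of))
```

In the toy data, `u2` has hits in three species but only two orders, so it is excluded. Counting orders rather than species keeps several closely related species from inflating the apparent breadth of conservation.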

Updated: 2020-03-31