Conserved long-range base pairings associated with pre-mRNA processing of human protein-coding genes
bioRxiv - Bioinformatics, Pub Date: 2021-01-20, DOI: 10.1101/2020.05.05.076927
Svetlana Kalmykova, Marina Kalinina, Stepan Denisov, Alexey Mironov, Dmitry Skvortsov, Roderic Guigó, Dmitri Pervouchine

The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. While DNA employs it for genome replication, RNA molecules fold into complicated secondary and tertiary structures. Current knowledge of functional RNA structures in human protein-coding genes is focused on locally occurring base pairs. However, chemical crosslinking and proximity ligation experiments have demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete catalog to date of conserved long-range RNA structures in the human transcriptome, which consists of 916,360 pairs of conserved complementary regions (PCCRs). PCCRs tend to occur within introns proximal to splice sites, suppress intervening exons, circumscribe circular RNAs, and exert an obstructive effect on cryptic and inactive splice sites. The double-stranded structure of PCCRs is supported by a significant decrease in icSHAPE nucleotide accessibility, a high abundance of A-to-I RNA editing sites, and the frequent occurrence of forked eCLIP peaks nearby. Introns with PCCRs show a distinct splicing pattern in response to RNA Pol II slowdown, suggesting that splicing is widely affected by co-transcriptional RNA folding. Additionally, transcript starts and ends are strongly enriched in regions between the complementary parts of PCCRs, leading to an intriguing hypothesis that RNA folding coupled with splicing could mediate co-transcriptional suppression of premature cleavage and polyadenylation events. The PCCR detection procedure is highly sensitive with respect to bona fide validated RNA structures, at the expense of a high false positive rate, which cannot be reduced without loss of sensitivity. The catalog of PCCRs is visualized through a UCSC Genome Browser track hub.

Updated: 2021-01-21