Quantitative RNA-seq meta-analysis of alternative exon usage in C. elegans
Genome Research (IF 6.2), Pub Date: 2017-12-01, DOI: 10.1101/gr.224626.117
Nicolas J. Tourasse , Jonathan R.M. Millet , Denis Dupuy

Almost 20 years after the completion of the C. elegans genome sequence, gene structure annotation is still an ongoing process with new evidence for gene variants still being regularly uncovered by additional in-depth transcriptome studies. While alternative splice forms can allow a single gene to encode several functional isoforms, the question of how much spurious splicing is tolerated is still heavily debated. Here we gathered a compendium of 1682 publicly available C. elegans RNA-seq data sets to increase the dynamic range of detection of RNA isoforms, and obtained robust measurements of the relative abundance of each splicing event. While most of the splicing reads come from reproducibly detected splicing events, a large fraction of purported junctions is only supported by a very low number of reads. We devised an automated curation method that takes into account the expression level of each gene to discriminate robust splicing events from potential biological noise. We found that rarely used splice sites disproportionately come from highly expressed genes and are significantly less conserved in other nematode genomes than splice sites with a higher usage frequency. Our increased detection power confirmed trans-splicing for at least 84% of C. elegans protein coding genes. The genes for which trans-splicing was not observed are overwhelmingly low expression genes, suggesting that the mechanism is pervasive but not fully captured by organism-wide RNA-seq. We generated annotated gene models including quantitative exon usage information for the entire C. elegans genome. This allows users to visualize at a glance the relative expression of each isoform for their gene of interest.
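
The idea of judging a junction against the expression level of its own gene, rather than by raw read count alone, can be illustrated with a minimal sketch. The abstract does not give the actual curation procedure, so everything below is an assumption made for illustration: the function names, the normalization against the gene's most-used junction, and the two thresholds (`min_relative_usage`, `min_reads`) are hypothetical and are not the authors' method.

```python
from collections import defaultdict


def relative_junction_usage(junction_reads):
    """Scale each junction's pooled read count by the most-used junction of its gene.

    junction_reads: dict mapping (gene, junction_id) -> read count pooled across data sets.
    Returns a dict with the same keys and values in [0, 1].
    """
    max_per_gene = defaultdict(int)
    for (gene, _junction), reads in junction_reads.items():
        max_per_gene[gene] = max(max_per_gene[gene], reads)
    return {
        key: (reads / max_per_gene[key[0]] if max_per_gene[key[0]] else 0.0)
        for key, reads in junction_reads.items()
    }


def classify_junctions(junction_reads, min_relative_usage=0.01, min_reads=5):
    """Label a junction 'robust' or 'rare'.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    usage = relative_junction_usage(junction_reads)
    return {
        key: (
            "robust"
            if junction_reads[key] >= min_reads and usage[key] >= min_relative_usage
            else "rare"
        )
        for key in junction_reads
    }


# Made-up counts: the same 20 reads mean different things in different genes.
example = {
    ("geneA", "j1"): 10000,  # highly expressed gene, main junction
    ("geneA", "j2"): 20,     # 0.2% of the main junction
    ("geneB", "j1"): 40,     # low-expression gene, main junction
    ("geneB", "j2"): 20,     # 50% of the main junction
}
labels = classify_junctions(example)
```

With these invented counts, the 20-read junction in the highly expressed gene is labeled rare while the 20-read junction in the low-expression gene is labeled robust, which is the kind of distinction an expression-aware filter is meant to draw.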




Updated: 2017-12-01