PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments.
Genome Research (IF 7), Pub Date: 2020-11-01, DOI: 10.1101/gr.252445.119
Peng Liu 1, Alexandra A Soukup 2, Emery H Bresnick 2, Colin N Dewey 1,3, Sündüz Keleş 1,4

Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by joint analysis of large collections of RNA-seq data sets has emerged as one such analysis. Current methods for transcript discovery rely on a ‘2-Step’ approach where the first step encompasses building transcripts from individual data sets, followed by the second step that merges predicted transcripts across data sets. To increase the power of transcript discovery from large collections of RNA-seq data sets, we developed a novel ‘1-Step’ approach named Pooling RNA-seq and Assembling Models (PRAM) that builds transcript models from pooled RNA-seq data sets. We demonstrate in a computational benchmark that 1-Step outperforms 2-Step approaches in predicting overall transcript structures and individual splice junctions, while performing competitively in detecting exonic nucleotides. Applying PRAM to 30 human ENCODE RNA-seq data sets identified unannotated transcripts with epigenetic and RAMPAGE signatures similar to those of recently annotated transcripts. In a case study, we discovered and experimentally validated new transcripts through the application of PRAM to mouse hematopoietic RNA-seq data sets. We uncovered new transcripts that share a differential expression pattern with a neighboring gene Pik3cg implicated in human hematopoietic phenotypes, and we provided evidence for the conservation of this relationship in human. PRAM is implemented as an R/Bioconductor package.
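
The contrast between the '2-Step' and '1-Step' strategies can be illustrated with a toy sketch. This is not PRAM's implementation (PRAM is an R/Bioconductor package that builds full transcript models); it only assumes, for illustration, that a splice junction is "called" when its read count reaches a hypothetical threshold `MIN_SUPPORT`. The function names, the threshold, and the junction counts are all made up for the example.

```python
from collections import Counter

MIN_SUPPORT = 3  # hypothetical minimum read count needed to call a junction


def call_junctions(counts, min_support=MIN_SUPPORT):
    """Return the set of junctions whose read support meets the threshold."""
    return {junction for junction, n in counts.items() if n >= min_support}


def two_step(datasets, min_support=MIN_SUPPORT):
    """'2-Step' analogue: call junctions per data set, then merge the calls."""
    merged = set()
    for counts in datasets:
        merged |= call_junctions(counts, min_support)
    return merged


def one_step(datasets, min_support=MIN_SUPPORT):
    """'1-Step' analogue: pool read counts across data sets, then call once."""
    pooled = Counter()
    for counts in datasets:
        pooled.update(counts)  # Counter.update adds counts key by key
    return call_junctions(pooled, min_support)


# Made-up junction read counts for three data sets.
datasets = [
    {"chr1:100-200": 5, "chr1:300-400": 1},
    {"chr1:100-200": 4, "chr1:300-400": 1},
    {"chr1:300-400": 2},
]

print(sorted(two_step(datasets)))
print(sorted(one_step(datasets)))
```

In this toy setting, a junction that is weakly supported in every individual data set never passes the per-data-set threshold, so merging cannot recover it, whereas pooling accumulates its support before the call is made. Whether and how much this helps on real data is what the benchmark described in the abstract evaluates.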

Updated: 2020-11-02