Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation
Genome Biology (IF 12.3), Pub Date: 2021-10-14, DOI: 10.1186/s13059-021-02502-z
Ankeeta Shah 1 , Briana E Mittleman 1 , Yoav Gilad 2, 3 , Yang I Li 2, 3

Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3′ ends. Most APA occurs within 3′ UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3′-Seq, a specialized RNA-seq protocol that enriches for reads at the 3′ ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, comparing their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3′-Seq and Iso-Seq identify polyadenylation sites and quantify their usage more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3′-Seq or Iso-Seq data can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3′-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input.
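
To make "polyadenylation site usage" concrete, here is a minimal Python sketch. It assumes usage is defined as the fraction of a gene's 3′ end reads assigned to each site, which is one common convention and not necessarily the exact definition used in the paper or by any of the benchmarked tools. The function name `site_usage`, the gene and site labels, and the read counts are all hypothetical illustrations, not data from the study.

```python
from collections import defaultdict


def site_usage(read_counts):
    """Return per-site usage fractions within each gene.

    read_counts maps (gene, site_id) -> number of 3' end reads
    assigned to that polyadenylation site (hypothetical input format).
    """
    # Sum the reads over all sites belonging to the same gene.
    gene_totals = defaultdict(int)
    for (gene, _site), count in read_counts.items():
        gene_totals[gene] += count

    # Usage of a site = its reads divided by the gene's total reads.
    usage = {}
    for (gene, site), count in read_counts.items():
        total = gene_totals[gene]
        usage[(gene, site)] = count / total if total > 0 else float("nan")
    return usage


# Made-up counts for illustration only.
counts = {
    ("GENE_A", "proximal"): 30,
    ("GENE_A", "distal"): 70,
    ("GENE_B", "only_site"): 12,
}
usage = site_usage(counts)
for key in sorted(usage):
    print(key, round(usage[key], 3))
```

Under this assumed definition, the usage values of all sites within a gene sum to 1, so a shift between conditions or genotypes shows up as a change in the relative fractions and not in total expression.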

Updated: 2021-10-14