Global Analysis of Transcription Start Sites in the New Ovine Reference Genome (Oar rambouillet v1.0)
Frontiers in Genetics (IF 3.7), Pub Date: 2020-09-09, DOI: 10.3389/fgene.2020.580580
Mazdak Salavati 1,2, Alex Caulton 3,4, Richard Clark 5, Iveta Gazova 1,6, Timothy P L Smith 7, Kim C Worley 8, Noelle E Cockett 9, Alan L Archibald 1, Shannon M Clarke 3, Brenda M Murdoch 10, Emily L Clark 1,2

The overall aim of the Ovine FAANG project is to provide a comprehensive annotation of the new highly contiguous sheep reference genome sequence (Oar rambouillet v1.0). Mapping of transcription start sites (TSS) is a key first step in understanding transcript regulation and diversity. Using 56 tissue samples collected from the reference ewe Benz2616, we have performed a global analysis of TSS and TSS-Enhancer clusters using Cap Analysis Gene Expression (CAGE) sequencing. CAGE measures RNA expression by 5′ cap-trapping and has been specifically designed to allow the characterization of TSS within promoters to single-nucleotide resolution. We have adapted an analysis pipeline that uses TagDust2 for clean-up and trimming, Bowtie2 for mapping, CAGEfightR for clustering, and the Integrative Genomics Viewer (IGV) for visualization. Mapping of CAGE tags indicated that the expression levels of CAGE tag clusters varied across tissues. Expression profiles across tissues were validated using corresponding polyA+ mRNA-Seq data from the same samples. After removal of CAGE tags with <10 read counts, 39.3% of TSS overlapped with 5′ ends of 31,113 transcripts that had been previously annotated by NCBI (out of a total of 56,308 from the NCBI annotation). For 25,195 of the transcripts, previously annotated by NCBI, no TSS meeting stringent criteria were identified. A further 14.7% of TSS mapped to within 50 bp of annotated promoter regions. Intersecting these predicted TSS regions with annotated promoter regions (±50 bp) revealed 46% of the predicted TSS were “novel” and previously un-annotated. Using whole-genome bisulfite sequencing data from the same tissues, we were able to determine that a proportion of these “novel” TSS were hypo-methylated (32.2%) indicating that they are likely to be reproducible rather than “noise”. This global analysis of TSS in sheep will significantly enhance the annotation of gene models in the new ovine reference assembly. 
Our analyses provide one of the highest resolution annotations of transcript regulation and diversity in a livestock species to date.
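
The read-count filter and the comparison against annotated 5′ ends described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which uses TagDust2, Bowtie2 and CAGEfightR. The function name `classify_tss`, the toy coordinates, and the reading of the thresholds (tags with at least 10 reads are kept, and a ±50 bp window is applied around each annotated 5′ end) are assumptions made for illustration only.

```python
from bisect import bisect_left

MIN_READS = 10   # assumption: tags with fewer than 10 reads are dropped
WINDOW_BP = 50   # assumption: +/-50 bp window around an annotated 5' end


def classify_tss(tss_counts, annotated_starts, min_reads=MIN_READS, window=WINDOW_BP):
    """Label each retained TSS as 'exact', 'within_window' or 'novel'.

    tss_counts: dict mapping (chrom, strand, position) -> read count
    annotated_starts: iterable of (chrom, strand, position) for annotated 5' ends
    """
    # Index annotated 5' ends by (chrom, strand) as sorted position lists.
    index = {}
    for chrom, strand, pos in annotated_starts:
        index.setdefault((chrom, strand), []).append(pos)
    for positions in index.values():
        positions.sort()

    labels = {}
    for (chrom, strand, pos), count in tss_counts.items():
        if count < min_reads:
            continue  # low-support tag, filtered out
        positions = index.get((chrom, strand), [])
        i = bisect_left(positions, pos)
        # The nearest annotated start sits at index i or i - 1.
        nearest = min(
            (abs(positions[j] - pos) for j in (i - 1, i) if 0 <= j < len(positions)),
            default=None,
        )
        if nearest == 0:
            labels[(chrom, strand, pos)] = "exact"
        elif nearest is not None and nearest <= window:
            labels[(chrom, strand, pos)] = "within_window"
        else:
            labels[(chrom, strand, pos)] = "novel"
    return labels


# Toy data (hypothetical coordinates, not taken from the study).
toy_tss = {
    ("chr1", "+", 1000): 25,   # coincides with an annotated 5' end
    ("chr1", "+", 2030): 12,   # 30 bp from an annotated 5' end
    ("chr1", "+", 9000): 40,   # no annotation nearby
    ("chr1", "-", 1000): 50,   # same position, opposite strand
    ("chr2", "+", 500): 3,     # below the read-count threshold
}
toy_annotation = [("chr1", "+", 1000), ("chr1", "+", 2000)]

result = classify_tss(toy_tss, toy_annotation)
for key, label in sorted(result.items()):
    print(key, label)
```

The three labels correspond loosely to the three groups reported in the abstract (39.3% overlapping annotated 5′ ends, 14.7% within 50 bp, 46% "novel"). The sketch matches by strand and single-base position, which is a simplification of cluster-level overlap.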




Updated: 2020-10-28