
Cap analysis of gene expression (CAGE) and noncoding regulatory elements

  • Review
Seminars in Immunopathology

Abstract

Cap analysis of gene expression (CAGE) was developed to detect the 5′ ends of RNA. Trapping the RNA 5′-cap structure enables the enrichment and selective sequencing of complete transcripts. Scaled-up, high-throughput versions of CAGE have enabled the genome-wide identification of transcription start sites, including those of transcriptionally active promoters and enhancers. CAGE sequencing can therefore be used to draw comprehensive maps of active genomic regulatory elements in a cell type- and activation-specific manner. Cells of the immune system are among the best candidates for such analyses in humans because they are easily accessible. In this review, we discuss how CAGE data support integrative analyses with quantitative trait loci and omics data, and how they help in the mechanistic interpretation of the effects of genetic variation across the entire human genome. Integrating CAGE data with currently available omics information will contribute to a better understanding of genome-wide association study variants that lie outside annotated genes, deepen our knowledge of human diseases, and enable the targeted design of more specific therapeutic interventions.



Author information

Corresponding author

Correspondence to Matteo Maurizio Guerrini.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information


This article is a contribution to the special issue on: Genetics and functional genetics of Autoimmune diseases - Guest Editors: Yukinori Okada & Kazuhiko Yamamoto

About this article

Cite this article

Guerrini, M.M., Oguchi, A., Suzuki, A. et al. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin Immunopathol 44, 127–136 (2022). https://doi.org/10.1007/s00281-021-00886-5

  • DOI: https://doi.org/10.1007/s00281-021-00886-5
