Pan-cancer pervasive upregulation of 3′ UTR splicing drives tumourigenesis

Chan, Jia Jia; Zhang, Bin; Chew, Xiao Hong; Salhi, Adil; Kwok, Zhi Hao; Lim, Chun You; Desi, Ng; Subramaniam, Nagavidya; Siemens, Angela; Kinanti, Tyas; Ong, Shane; Sanchez-Mejias, Avencia; Ly, Phuong Thao; An, Omer; Sundar, Raghav; Fan, Xiaonan; Wang, Shi; Siew, Bei En; Lee, Kuok Chung; Chong, Choon Seng; Lieske, Bettina; Cheong, Wai-Kit; Goh, Yufen; Fam, Wee Nih; Ooi, Melissa G.; Koh, Bryan T. H.; Iyer, Shridhar Ganpathi; Ling, Wen Huan; Chen, Jianbin; Yoong, Boon-Koon; Chanwat, Rawisak; Bonney, Glenn Kunnath; Goh, Brian K. P.; Zhai, Weiwei; Fullwood, Melissa J.; Wang, Wilson; Tan, Ker-Kan; Chng, Wee Joo; Dan, Yock Young; Pitt, Jason J.; Roca, Xavier; Guccione, Ernesto; Vardy, Leah A.; Chen, Leilei; Gao, Xin; Chow, Pierce K. H.; Yang, Henry; Tay, Yvonne

doi:10.1038/s41556-022-00913-z


Article
Open access
Published: 26 May 2022



Nature Cell Biology volume 24, pages 928–939 (2022)


Abstract

Most mammalian genes generate messenger RNAs with variable untranslated regions (UTRs) that are important post-transcriptional regulators. In cancer, shortening at 3′ UTR ends via alternative polyadenylation can activate oncogenes. However, internal 3′ UTR splicing remains poorly understood as splicing studies have traditionally focused on protein-coding alterations. Here we systematically map the pan-cancer landscape of 3′ UTR splicing and present this in SpUR (http://www.cbrc.kaust.edu.sa/spur/home/). 3′ UTR splicing is widespread, upregulated in cancers, correlated with poor prognosis and more prevalent in oncogenes. We show that antisense oligonucleotide-mediated inhibition of 3′ UTR splicing efficiently reduces oncogene expression and impedes tumour progression. Notably, CTNNB1 3′ UTR splicing is the most consistently dysregulated event across cancers. We validate its upregulation in hepatocellular carcinoma and colon adenocarcinoma, and show that the spliced 3′ UTR variant is the predominant contributor to its oncogenic functions. Overall, our study highlights the importance of 3′ UTR splicing in cancer and may launch new avenues for RNA-based anti-cancer therapeutics.


Main

Messenger RNAs (mRNAs) comprise protein-coding regions flanked by 5′ and 3′ untranslated regions (UTRs) that play important roles in post-transcriptional regulation. UTRs harbour many regulatory sequences and structures, such as AU-rich elements, G-rich elements and microRNA response elements, through which RNA-binding proteins (RBPs) and microRNAs modulate mRNA metabolism¹. This includes processes such as mRNA localization, stability and export, which are tightly controlled to ensure correct gene expression and function under physiological conditions.

Most mammalian genes generate alternative 3′ UTRs via various mechanisms including alternative polyadenylation (APA) and alternative splicing. In addition to key physiological functions, frequent observation of mutations and other structural variations in 3′ UTRs in various disease states suggests that 3′ UTR processing may play critical roles in pathogenesis^2,3. For example, structural variations disrupting the 3′ UTR of PD-L1 led to its overexpression and evasion of anti-tumour immunity². In cancer, research on alternative 3′ UTRs has almost exclusively focused on APA-derived shorter 3′ UTRs^4,5. Critically, studies have shown that 3′ UTR shortening by APA disrupted microRNA binding and associated competing endogenous RNA networks in which transcripts compete for shared microRNAs, resulting in the aberrant expression of key oncogenes and tumour suppressors in cancer^4,6. Additionally, these shortened 3′ UTRs could differentially regulate protein function, localization and protein–protein interactions to confer oncogenic advantages to cancer cells^6,7,8,9.

More than 95% of human genes undergo alternative splicing, which dramatically increases transcriptome and proteome diversity. Recent large-scale transcriptomic analyses have revealed a high frequency of aberrant splicing in cancer¹⁰. Although 3′ UTR splicing events (3USPs) are annotated in databases, only a handful have been characterized so far, as the vast majority of splicing studies have concentrated on the protein-coding regions of mRNAs¹¹. In this Article, we systematically map the pan-cancer landscape of 3′ UTR splicing and investigate its potential impact on oncogene expression and cancer progression.

Results

Global analysis reveals widespread 3′ UTR splicing

Using splice junctions specifically located within 3′ UTRs (Fig. 1a), we identified and quantified 3USPs in 7,917 RNA sequencing (RNA-seq) samples across ten cancer types from The Cancer Genome Atlas (TCGA) and the corresponding tissues from the Genotype-Tissue Expression (GTEx) project (Supplementary Table 1). In total, 45,815 and 18,253 3USPs were identified from TCGA tumours and their adjacent normal samples (TCGA-tumour and TCGA-normal), respectively, while 68,668 events were identified in healthy tissues from GTEx (Fig. 1b).

**Fig. 1: Transcriptome-wide analysis reveals widespread 3′ UTR splicing.**

For each cancer and tissue type, we defined common 3USPs (c3USPs) as events detected in more than half the samples. This enabled us to obtain a robust list of events for further analysis. Compared with the 3USPs, the total number of identified c3USPs was more consistent across different datasets, cancer and tissue types (Fig. 1b and Supplementary Table 1). Moreover, c3USPs were highly reproducible between the TCGA and GTEx datasets (Extended Data Fig. 1a,b and Supplementary Note), while the majority were ubiquitously detected in different cancers and tissues (Fig. 1c and Extended Data Fig. 1c,d). As both datasets were generated from Illumina short-read sequencing, we could not exclude the possibility that the junction reads were from independent 3′ UTRs¹². To address this, we investigated whether these events were supported by long-read Pacific Biosciences (PacBio) sequencing and found that ~50% of the c3USPs were supported (Fig. 1d). More importantly, compared with the GENCODE annotations and the published TCGASpliceSeq¹³, ~20% of the c3USPs identified from our analysis are unannotated (Extended Data Fig. 1e). Furthermore, we analysed the distances from stop codons to 3′ UTR splice sites and the sequence features of the removed introns, and found that 3′ UTR splicing is unlikely to trigger nonsense-mediated mRNA decay (NMD) and may additionally mitigate Staufen-mediated mRNA decay (SMD) (Fig. 1e,f, Extended Data Fig. 1f–l and Supplementary Note). These data suggest that not all 3USPs have been annotated, and they could be biologically functional.

3′ UTR splicing is upregulated in cancer

To identify spliced 3′ UTRs that were dysregulated in cancers, we compared the splicing levels (SPLs) of each c3USP in tumours with those in the adjacent normal samples of each TCGA cancer type and in the corresponding GTEx normal tissue (Extended Data Fig. 2a). In total, 671 of 1,490 c3USPs showed significant differences in at least one cancer type (Supplementary Table 2). Eight of the ten cancer types analysed had more significantly upregulated than downregulated c3USPs (Fig. 1g). Intriguingly, significantly more oncogenes (P < 0.01, hypergeometric test) were represented among genes displaying upregulated rather than downregulated 3′ UTR splicing, whereas the difference for tumour suppressors was not significant (Extended Data Fig. 2b,c).
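The hypergeometric over-representation test named above can be illustrated with a short sketch. It computes the upper-tail probability P(X ≥ k) with the standard library only. The function name, argument names and the counts in the usage note are hypothetical and are not taken from the study, whose exact gene universe and oncogene list are not given in this section.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    population: total number of genes considered (assumed universe)
    successes:  number of annotated oncogenes in that universe
    draws:      number of genes with upregulated 3' UTR splicing
    k:          number of oncogenes observed among the drawn genes
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total
```

For example, `hypergeom_upper_tail(3, 10, 3, 4)` gives the chance that all 3 oncogenes in a toy universe of 10 genes land among 4 selected genes. A small value indicates enrichment beyond what random sampling would produce.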

We found that the c3USPs identified in different cancer types overlap significantly (P < 1 × 10⁻¹⁰⁰, hypergeometric test; Extended Data Fig. 2d). To further examine common versus tissue-specific c3USPs, we extended our analysis to a haematological malignancy, acute myeloid leukaemia (AML) (34 patients versus 21 healthy controls). Among the 1,431 c3USPs identified from these samples, ~46% overlapped with events identified from the ten solid tumours, a lower proportion compared with the 82% overlap among the solid tumours (Fig. 1h and Extended Data Fig. 2d). However, in line with our observations from the solid tumours, 160 c3USPs were significantly upregulated, while only 46 were downregulated in AML (Fig. 1h).

Next, we investigated the association between the SPLs of each c3USP and overall patient survival (OS) in each solid tumour type. We defined two types of prognosis-associated c3USPs based on clinical outcomes: unfavourable and favourable events, for which higher SPLs were correlated with poorer and better OS, respectively. The ratios of unfavourable and favourable events varied among the different cancer types (Fig. 1i, Extended Data Fig. 2e, Supplementary Table 3 and Supplementary Note). We overlapped them with the significantly dysregulated c3USPs across different cancers and found that the upregulated c3USPs were significantly more unfavourable (Extended Data Fig. 2f). Ninety of these were identified as unfavourable prognostic markers, while only 16 downregulated c3USPs were favourable (Fig. 1i). Taken together, these results suggest that 3′ UTR splicing is preferentially upregulated in cancers and may be linked to cancer outcomes.

Top dysregulated 3USPs across cancers

To gain a pan-cancer overview of each significantly dysregulated c3USP, we first measured the number of tumour samples in which the event was over- and under-spliced in each cancer type (Fig. 2a). By combining the ten solid tumour types and AML, we showed that 3′ UTRs were more often over-spliced (median 149) than under-spliced (median 10) in tumours across the 671 significantly dysregulated c3USPs. Among the ten c3USPs with the highest number of over-spliced tumour samples (Fig. 2b and Supplementary Table 4), the top candidate, CTNNB1 c3USP (3′ SP), was over-spliced in ~40% of tumour samples (2,251/5,577) in 10 of the 11 cancer types analysed (Fig. 2b,c and Extended Data Fig. 2g–i). Notably, CTNNB1 is a well-known oncogene that is also the second most frequently mutated gene after TP53 in hepatocellular carcinoma (HCC) (26% versus 30.8% of TCGA-liver HCC (LIHC) tumour samples). Additionally, c3USPs from other annotated oncogenes, including TCF3 (1,335/5,577) and HRAS (359/5,577), were also highly over-spliced (Fig. 2d).

**Fig. 2: Pan-cancer analysis identifies *CTNNB1* as the top dysregulated c3USP across 11 cancers.**

Targeted inhibition of 3′ UTR splicing impedes HCC carcinogenesis

Among the ten cancer types studied, we found that the TCGA-LIHC samples had significantly higher numbers of 3USPs than their adjacent normal samples, and this was still true after their normalization to the number of all splicing events, including those from 5′ UTRs, 3′ UTRs, coding sequences (CDS) and non-coding RNAs (Fig. 3a and Extended Data Fig. 2j). Critically, a high number of 3USPs, but not the number of all splicing events, was significantly correlated with poorer OS (Fig. 3b and Extended Data Fig. 2k). We further analysed an additional RNA-seq dataset containing 211 samples from the Precision Medicine in Liver Cancer Asia-Pacific Network (PLANet) consortium¹⁴, and an in-house dataset of four paired HCC-adjacent normal samples, which were sequenced to greater depths (Supplementary Table 5). The 3USPs identified from both datasets were highly consistent with those from the TCGA-LIHC data (Fig. 3c and Extended Data Fig. 3a). Moreover, the dysregulation of c3USPs in the TCGA and PLANet datasets was highly correlated (Extended Data Fig. 3b). Among the 31 shared significantly upregulated events, we selected the top 5 candidates not annotated as NMD targets (CTNNB1, CHEK1, MAPK1, THUMPD1 and WDR55) for further experimental validation.

Following the Sanger sequencing validation of the candidate 3′ UTR splice junctions (Fig. 3d), we designed antisense oligonucleotides (ASOs) to block the 3′ UTR splice sites (Supplementary Note). These ASOs significantly reduced their respective 3′ SP expression with a concomitant increase in the unspliced, full-length transcript (3′ FL) for CHEK1, CTNNB1 and THUMPD1, while the CDS transcripts were unaffected in the HCC cell lines, Hep3B and HepG2 (Fig. 3e,f, Extended Data Fig. 3c,d and Supplementary Note). These effects were accompanied by a decrease in the protein expression of the respective genes and the repression of tumour growth, likely due to cell cycle inhibition as evident from the downregulated expression of cell cycle genes, including CCNE1, CDK2, CDK4 and CDK6 (Fig. 3g,h and Extended Data Figs. 3e,f and 4a). This was further confirmed with additional ASOs (Extended Data Fig. 4). Thus, these ASOs could specifically inhibit 3′ UTR splicing, which potentially plays an important role in regulating the protein expression and tumourigenic functions of oncogenes.

3′ UTR splicing of CTNNB1 promotes tumourigenesis

We further examined the most consistently dysregulated c3USP, CTNNB1 3′ SP in HCC and found that only over-splicing of the CTNNB1 3′ UTR, but not its somatic mutational status or total transcript expression, was significantly correlated with poorer OS (Fig. 4a), suggesting that its 3′ UTR splicing could be a robust prognosticator for HCC. 3′ UTR splicing of CTNNB1 generates two 3′ UTR variants, 3′ SP (NM_001098210) and 3′ SP2 (NM_001330729). However, as 3′ SP2 is not significantly upregulated in HCC and has minimal effects on CTNNB1 expression and tumourigenesis, we focused only on 3′ SP for further experimental validation (Fig. 4b and Extended Data Fig. 5a–d). As the CTNNB1 3′ UTR is spliced 11/12 nt downstream of the stop codon, we ruled out splicing-induced NMD by knocking down a key NMD regulator, UPF1, which did not alter CTNNB1 transcript and protein expression (Fig. 4c–e). This was also observed for other 3USP candidates: CHEK1, MAPK1, THUMPD1 and WDR55 (Fig. 4d,e). Consistent with its expression in patient samples, CTNNB1 3′ SP was upregulated in HCC cell lines, Hep3B and SNU398, compared with THLE-2 (Fig. 4f). We also verified that CTNNB1 3′ SP is conserved in mouse and is more highly expressed in the mouse tumour relative to the adjacent normal tissue (Extended Data Fig. 5e). Collectively, these data underscore the functional relevance of the spliced 3′ UTR across different species and in a disease setting.

**Fig. 4: Analysis of the *CTNNB1* spliced 3′ UTR in HCC.**

As CTNNB1 plays critical roles in adherens junction formation and WNT signalling to regulate cell proliferation and migration¹⁵, we first performed gene set enrichment analysis (GSEA) by comparing two groups of the tumour samples: (1) tumour samples with over-spliced CTNNB1 3′ SP and (2) the remaining tumour samples. We showed that over-splicing of the CTNNB1 3′ UTR was significantly associated with the upregulation of WNT signalling and cell cycle genes in both TCGA-LIHC and PLANet tumour samples (Fig. 5a and Extended Data Fig. 6a). To investigate the functional effects of CTNNB1 3′ UTR splicing, we employed a complementary approach to the ASOs using custom-designed short interfering RNAs (siRNAs) that specifically targeted the CDS, the intron of 3′ FL and the unique splice junction of 3′ SP. These siRNAs efficiently and specifically downregulated the expression of their respective transcripts (Fig. 5b, Extended Data Fig. 6b, Supplementary Table 6 and Supplementary Note). Additionally, 3′ SP depletion significantly reduced CTNNB1 CDS transcript and protein expression compared with 3′ FL knockdown (Fig. 5b,c and Extended Data Fig. 6b,c), suggesting that the CTNNB1 protein is primarily expressed from the 3′ SP transcript. Consistent with the effect of the splice site-targeting ASO, we observed si-CDS and si-3′ SP-mediated reduction in cell growth and migration (Fig. 5d and Extended Data Fig. 6d,e), and a lack of tumour growth in mouse xenografts (Fig. 5e). These could partially be due to the diminished expression of WNT target genes, such as AXIN2, MYC and TCF7, as well as cell cycle markers CDK2, CDK4, CDK6 and CCNE1, upon the siRNA- and ASO-mediated downregulation of CTNNB1 3′ SP, in line with the GSEA results (Fig. 5f,g and Extended Data Fig. 6f,g). We further employed the clustered regularly interspaced short palindromic repeats (CRISPR)–Cas9 system to mutate the CTNNB1 3′ UTR splice site (GT > GG) at the genomic level in Hep3B cells (Extended Data Fig. 6h). 
This led to a significant downregulation of the 3′ SP transcript expression compared with the CDS and 3′ FL transcripts, and a simultaneous decrease in CTNNB1 protein expression and cell proliferation (Fig. 5h,i and Extended Data Fig. 6i). Additionally, we verified these knockdown effects with additional siRNAs (Extended Data Fig. 7) and further tested the same ASOs and siRNAs in the COAD cell line DLD-1 since global 3′ UTR splicing is significantly increased in COAD and associated with poorer OS, while CTNNB1 3′ UTR splicing is also upregulated in COAD (Extended Data Fig. 8a–e). These resulted in similar phenotypic effects to those in the HCC cells (Extended Data Fig. 8f–m). Collectively, these findings suggest that the CTNNB1 3′ SP transcript is the predominantly translated isoform and highlight the critical role that the 3′ SP variant plays in the regulation of CTNNB1 expression and oncogenic function.

**Fig. 5: 3′ UTR splicing of *CTNNB1* promotes tumourigenesis.**

3′ UTR splicing may enhance CTNNB1 expression to promote tumourigenesis

Next, we analysed ENCODE RNA-seq data and identified hundreds of 3USPs that were up- or downregulated upon knockdown of different RBPs (Extended Data Fig. 9a). We integrated this with crosslinking and immunoprecipitation (CLIP)-seq data¹⁶, and selected RBPs with putative binding sites within the terminal exon of CTNNB1 (including both the CDS and unspliced 3′ UTR) and/or whose knockdown resulted in significant changes in CTNNB1 3′ SP for further validation (Fig. 6a and Supplementary Table 7). Only siRNA-mediated knockdown of SF3B1, SRSF1 and U2AF2 consistently downregulated the 3′ SP transcript, concomitantly increased the 3′ FL transcript and reduced CTNNB1 protein expression without affecting the four CDS exon–exon junctions tested (Fig. 6b,c and Extended Data Fig. 9b–e). We also showed that RNA immunoprecipitation (RIP) of SRSF1 and U2AF2 significantly enriched for both CTNNB1 3′ UTR variants, while SF3B1 RIP enriched for only CTNNB1 3′ FL (Fig. 6d). We further verified these associations by pulling down the CTNNB1 transcripts whereby SRSF1 and U2AF2 were enriched by the antisense 3′ FL and 3′ SP probes, and consistent with the RIP results, enrichment of SF3B1 was observed only for the 3′ FL pulldown (Fig. 6e). These observations suggest that these RBPs may associate with the CTNNB1 3′ UTR and modulate its splicing.

To investigate whether the 3′ UTR variants may exhibit varying phenotypic effects due to their differential regulation of CTNNB1 expression, we first assessed the protein expression of CTNNB1 overexpressed from constructs containing CTNNB1 CDS tagged to each 3′ UTR variant (Extended Data Fig. 9f). Despite similar transcript levels, we observed higher CTNNB1 protein expression from the 3′ SP variant (Fig. 6f and Extended Data Fig. 9g). Next, we performed a luciferase reporter assay using reporter constructs containing the different 3′ UTR variants. 3′ SP significantly increased luciferase activity compared with 3′ FL (Fig. 6g). Similar results were also observed in COAD cells (Extended Data Fig. 10a–c). We then tested whether 3′ UTR splicing regulated CTNNB1 expression at the transcript or protein level by inhibiting transcription or translation following the overexpression of HA-tagged CTNNB1 variants. The transcript and protein expression of both 3′ FL and 3′ SP variants were similarly changed (Extended Data Fig. 10d,e), contrary to a previous study that demonstrated a longer mRNA half-life for CTNNB1 3′ SP in HeLa cells, which could be due to tissue-specific regulation¹⁷. We further inhibited proteasomal degradation and did not observe differential CTNNB1 protein stability (Extended Data Fig. 10f).

Next, we performed the translation reporter assay to investigate the effect of the CTNNB1 3′ UTR variants on translation efficiency and observed a significant increase in the 3′ SP luciferase signal compared with that of 3′ FL (Fig. 6h), implying that the 3′ SP variant may be preferentially translated. As this assay relies on exogenously expressed constructs, we also performed polysome profiling to detect translation efficiency of the endogenous CTNNB1 transcript variants. In contrast to the translation reporter assay, polysome profiling for Hep3B and SNU398 cells showed that the 3′ FL and 3′ SP transcripts are similarly distributed across the polysome fractions, suggesting the transcript variants present in the cytoplasm are equally translated (Fig. 6i). This discrepancy could be due to several factors: (1) the luciferase ORF (~1 kb) is much smaller than that of CTNNB1 (~3 kb), which could carry additional components that influence its splicing, folding and/or translation, and (2) the luciferase reporters are exogenously expressed, whereas the polysome profiles measure endogenous levels of CTNNB1 and may be more representative of physiological conditions.

Previous studies have demonstrated nuclear retention of intron-containing transcripts by the U1 small nuclear ribonucleoprotein (snRNP), a component of the RNA spliceosome, to regulate the efficient expression of protein-coding mRNAs^18,19,20. We postulated that the variation in protein expression from the CTNNB1 3′ UTR variants may be attributed to their different transcript localization. To investigate this, we performed U1 RIP. Only the 3′ FL transcript was significantly enriched, while 3′ SP was undetected (Fig. 7a). Consistently, both nuclear–cytoplasmic fractionation and RNA fluorescence in situ hybridization (FISH) data showed that the 3′ SP transcripts were predominantly cytoplasmic, whereas the 3′ FL transcripts were mainly nuclear (Fig. 7b,c), which we also observed for CHEK1 (Extended Data Fig. 10g,h). These findings suggest that nuclear retention of the intron-containing 3′ FL transcripts may contribute to their reduced availability for the cytoplasmic translational machinery, resulting in decreased protein expression.

**Fig. 7: 3′ UTR splicing-mediated cytoplasmic localization may enhance CTNNB1 expression.**

To further interrogate the importance of 3′ UTR splicing for CTNNB1 expression, we mutated the 5′ splice site (5′ SSmut) of the CTNNB1 3′ FL plasmid constructs. Overexpression of CTNNB1 5′ SSmut resulted in CTNNB1 protein levels higher than that of wild-type 3′ FL (3′ FL-WT) and comparable to 3′ SP (Fig. 8a). It also significantly increased luciferase activity compared with 3′ FL-WT in the luciferase reporter assay, but only in SNU398, and not in the translation reporter assay (Fig. 8b,c), suggesting that the 5′ SS mutation does not confer translational advantage. This is supported by the polysome profile of cells treated with splice site-blocking ASOs showing comparable distributions of the 3′ FL and 3′ SP variants across the polysome fractions compared with the control (Fig. 8d). Taken together, these findings indicate that differential cellular localization of the CTNNB1 3′ UTR variants could be the predominant factor impacting CTNNB1 protein expression. CTNNB1 may be primarily translated from the 3′ SP transcripts that are exported to the cytoplasm upon splicing, highlighting the importance of 3′ UTR splicing in driving oncogene expression and cancer progression (Fig. 8e).

**Fig. 8: Upregulated 3′ UTR splicing may promote carcinogenesis.**

Discussion

Multiple studies have shown that aberrant splicing in cancer confers proliferation, migratory and drug resistance advantages to cancer cells^10,21. However, these have mostly focused on splicing events in coding exons as 3′ UTR splicing was often thought to trigger NMD²². Here we build SpUR, a database to comprehensively characterize 3USPs in human cancers and their corresponding normal tissues (http://www.cbrc.kaust.edu.sa/spur/home). We reveal that 3′ UTR splicing is widespread, upregulated in cancer, correlated with poor prognosis and more prevalent in oncogenes. We demonstrate the physiological, functional and clinical relevance of the spliced 3′ UTR of the key oncogene CTNNB1. We show that CTNNB1 is over-spliced in ~40% of tumour samples in ten cancer types, and its spliced 3′ UTR (1) is a more robust prognostic indicator compared with its transcript expression and somatic mutational status in HCC; (2) is not an NMD target; (3) promotes cell proliferation and migration; and (4) enhances protein expression potentially through its cytoplasmic localization (Fig. 8e). Furthermore, these properties extend to 3′ UTR variants of other genes, such as CHEK1. Critically, dysregulated 3USPs may also play key roles in other cancer types, including both solid tumours and haematological malignancies. The low overlap of c3USPs between AML and the solid tumours is noteworthy and may reflect potential intrinsic differences between blood and solid tumours at the genomic level. Further work on other haematological malignancies will provide a better understanding of these variations.

As 3′ UTRs carry regulatory elements critical for modulating RNA metabolism and even protein activity^23,24, deregulated 3′ UTR splicing could have other mechanistic and functional consequences. The loss of 3′ UTR regulatory elements and binding sites through splicing coupled with possible splicing-mediated changes in RNA secondary structures could significantly disrupt molecular interactions, such as those with microRNAs and RBPs, and their regulatory effects. Additionally, we show that 3′ UTRs that undergo splicing contain introns enriched in Alu elements, which are known to facilitate splicing and RNA editing^25,26. Potential crosstalk between these two RNA processing steps that are highly dysregulated in cancer could further disrupt gene expression to drive tumourigenesis^10,21,27.

In recent years, the use of RNA-based therapeutics has been gaining momentum. In particular, many ASOs are undergoing clinical trials for the treatment of various medical conditions. A handful of anti-cancer ASOs, such as Danvatirsen (AZD9150) and Trabedersen (OT-101), which target STAT3 and TGF-β2, respectively, have had varying levels of success²⁸. Here we demonstrate the use of ASOs to manipulate 3′ UTR splicing to repress oncogene expression and cancer cell proliferation. Chemically modified ASOs have been successfully delivered via different routes of administration and shown to be active in various tissues, making them an attractive treatment option for different cancers²⁹. Moreover, the specific upregulation of 3′ UTR splicing in cancer suggests that these ASOs could possess therapeutic potential with minimal effects in normal cells, which are beneficial properties for the development of ASO-based anti-cancer drugs.

Multiple studies have identified a large repertoire of RBPs and demonstrated their essential roles in a diverse range of regulatory processes³⁰. A recent ENCODE study has gone a step further to construct their binding and functional maps from multiple eCLIP datasets¹⁶, which could facilitate easier identification of 3′ UTR-specific splicing factors that can be targeted for cancer treatment. For example, SRSF1, which can potentially modulate CTNNB1 3′ UTR splicing, was targeted using decoy RNA oligonucleotides to dampen its activity in a recent study³¹. With the relative simplicity of designing RNA-based therapies using base-pairing complementarity and the rapid advancement in drug delivery strategies, oligonucleotide-based drugs that can efficiently target cancer-specific 3′ UTR splicing, 3′ UTR spliced variants or the splicing factors involved could be a game changer in the field of cancer therapeutics.

3′ UTR splicing could be a widespread mechanism that cancer cells exploit to generate NMD- and SMD-insensitive and intronless transcripts that are effectively exported to the cytoplasm to promote oncogene expression and tumourigenesis. These findings provide key insights into our understanding of this poorly characterized facet of RNA processing and its contribution to transcriptome heterogeneity and carcinogenesis. In particular, we provide evidence that specific targeting of 3′ UTR splicing could effectively attenuate the tumourigenic phenotype of key oncogenes. Furthermore, the upregulation of 3′ UTR splicing in various cancers and its significant correlation with prognosis suggest that its detection and targeting may represent new avenues for the development of more targeted diagnostics and therapeutics.

Methods

This study complies with all relevant ethical regulations. The human studies were approved by the following institutional review boards (IRBs): the Domain Specific Review Board under the National Healthcare Group in Singapore, the Central Institution Review Board (CIRB) of SingHealth, of which all National Cancer Center Singapore, Singapore General Hospital and National University Hospital were constituent members (CIRB Ref: 2016/2626 and 2018/2112), Medical Research Ethics Committee of UMMC (MREC ID number 201713-4729) and Research Committee of National Cancer Institute Thailand (project number 174_2017C_OUT504). Animal protocols were approved by the National University of Singapore (NUS) Institutional Animal Care and Use Committee (IACUC).

Human studies

The protocols for the human studies were approved by the IRBs listed above. Each patient gave informed written consent. The AML study includes 18 AML patients (10 females and 8 males) in the age range of 22–75 years and 15 patients undergoing total knee replacements (11 females and 4 males) in the age range of 47–84 years. The PLANet study includes 46 patients (13 females and 33 males) in the age range of 47–87 years. There is no patient information available for the in-house HCC dataset as all patients in this study have been de-identified.

Identification of 3USPs

RNA-seq data (fastq files) of 7,917 samples from TCGA and GTEx were downloaded from the dbGaP repository (2016) and aligned to the reference human genome (hg19) using STAR (v2.5.2a)³². In total, ten TCGA cancer types with more than 30 adjacent normal samples were selected for analysis. The derived splicing junction (splicing-out) reads were filtered and merged to identify all introns. For each defined intron, the splicing-in reads (covering the splice site by at least 6 nt) were counted by featureCounts (v1.6.1) and SPLs were calculated using the following formula:

$$\text{Splicing level (SPL)} = \frac{\text{splicing-out reads}}{\text{splicing-out reads} + \text{splicing-in reads}/2}$$
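As a worked illustration of the formula above, the following minimal sketch computes the SPL of one intron in one sample from its two read counts. The function name and the choice to return `None` for an intron with no reads are assumptions made for this example, not details stated in the Methods.

```python
def splicing_level(splicing_out_reads, splicing_in_reads):
    """Compute SPL = out / (out + in / 2) for one intron in one sample.

    splicing_out_reads: junction reads that skip the intron (spliced)
    splicing_in_reads:  reads that cover a splice site (unspliced)
    Returns None when the intron has no supporting reads at all
    (an assumption of this sketch).
    """
    denominator = splicing_out_reads + splicing_in_reads / 2
    if denominator == 0:
        return None
    return splicing_out_reads / denominator
```

With 30 splicing-out reads and 20 splicing-in reads, the SPL is 30 / (30 + 10) = 0.75, that is, 75% of the transcripts covering this intron are estimated to be spliced.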

To identify putative splicing events in 3′ UTRs, each intron was overlapped with annotations from GENCODE (v23) (ref. ³³). Only introns specifically located within 3′ UTRs that did not overlap with any annotated CDS or 5′ UTR were retained. Among these introns, those that were supported by at least two junction reads and exhibited SPLs >10% in at least one sample were selected as putative 3USPs. The same approach was applied to identify and quantify the 3USPs in the AML, PLANet and in-house HCC RNA-seq samples. All clinical samples from human research participants were obtained in accordance with the protocols approved by the relevant IRBs in Singapore.
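The selection rule for putative 3USPs can be sketched as follows. This sketch assumes that both criteria (at least two junction reads and SPL >10%) must hold in the same sample, which the text does not state explicitly; the function and constant names are hypothetical.

```python
MIN_JUNCTION_READS = 2  # "at least two junction reads"
MIN_SPL = 0.10          # "SPLs >10%"


def is_putative_3usp(samples):
    """samples: list of (junction_reads, spl) pairs, one per sample,
    for a single intron located within a 3' UTR.

    Assumption: both thresholds must be met in the same sample.
    """
    return any(
        reads >= MIN_JUNCTION_READS and spl > MIN_SPL
        for reads, spl in samples
    )
```

For example, `is_putative_3usp([(1, 0.5), (3, 0.12)])` is true because the second sample satisfies both thresholds.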

For each RNA-seq sample, we counted both the total number of 3USPs and all splicing events including those in the 5′ UTR, 3′ UTR, CDS and non-coding RNAs. To exclude the influence of transcriptional activity in quantifying the number of 3USPs, the number of 3USPs in each sample was normalized by dividing it by the number of all splicing events. To account for the variability caused by different sample sizes, sequencing depths and read lengths, we further defined common 3′ UTR splicing events (c3USPs) as the 3USPs exhibiting SPLs >10% in more than half of the samples in each TCGA cancer and GTEx tissue type. This was also applied to the AML and PLANet datasets, but not to the in-house data owing to its small sample size (n = 4).
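A minimal sketch of the per-sample normalization and the c3USP definition is given below. The data layout (a dictionary of SPL lists per event) and the function names are assumptions for illustration only.

```python
def normalized_3usp_count(n_3usps, n_all_splicing_events):
    """Number of 3USPs in a sample divided by all splicing events in it."""
    return n_3usps / n_all_splicing_events


def common_3usps(spl_by_event, threshold=0.10):
    """spl_by_event: dict mapping an event ID to the list of SPLs observed
    across the samples of one cancer or tissue type.

    Returns the events with SPL > threshold in more than half of the samples.
    """
    common = []
    for event_id, spls in spl_by_event.items():
        n_above = sum(1 for spl in spls if spl > threshold)
        if n_above > len(spls) / 2:  # strictly more than half
            common.append(event_id)
    return common
```

With four samples, an event must exceed the threshold in at least three of them; exactly two of four does not qualify under the "more than half" wording.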

Analysis of dysregulated 3′ UTR splicing in cancer

To identify differentially spliced 3′ UTRs between tumour and normal samples, we used two approaches to analyse the TCGA, AML, PLANet and in-house datasets owing to the different numbers of samples (5,543 TCGA-tumour samples in 10 cancers, 34 AML, 165 PLANet HCC and 4 pairs of in-house HCC tumour samples). For each TCGA cancer cohort, we compared the SPL of each c3USP between the tumour and adjacent normal samples (TCGA-tumour and TCGA-normal) using the Mann–Whitney U test. The Benjamini–Hochberg method was used for multiple test adjustment (false discovery rate (FDR) <0.1). The same approach was applied to the AML and PLANet HCC samples with the following cut-offs: FDR <0.1 and median SPL difference between tumour and normal >5%. Significant candidates in each TCGA cancer type were filtered on the basis of the criteria: (1) unidirectional median SPL changes between TCGA-tumour and TCGA-normal, and TCGA-tumour and GTEx, and (2) median SPL difference of >5% between TCGA-tumour and TCGA-normal or TCGA-tumour and GTEx. For the in-house dataset, we applied a method optimized for the detection and quantification of splicing differences between tumour and normal samples for a small sample size as previously described³⁴. Significance was determined using permutation-derived FDR (<0.1) and the median SPL difference between tumour and normal (>5%).
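The multiple-testing and effect-size filters described above can be sketched in plain Python. The per-event P values are assumed to come from a Mann–Whitney U test computed elsewhere (for instance with a routine such as SciPy's `mannwhitneyu`, which is an assumption about tooling, not something the text specifies). Function names are hypothetical.

```python
from statistics import median


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_dysregulated(fdr, tumour_spls, normal_spls,
                    fdr_cutoff=0.1, min_delta=0.05):
    """Apply the FDR <0.1 and median SPL difference >5% cut-offs."""
    delta = median(tumour_spls) - median(normal_spls)
    return fdr < fdr_cutoff and abs(delta) > min_delta
```

The sign of the median difference, which is not returned here, would distinguish upregulated from downregulated events.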

To identify tumour samples that exhibit over-splicing for a given significantly dysregulated c3USP in each cancer type, the SPLs in each tumour sample were compared with the SPLs from the corresponding TCGA-normal and GTEx samples. The dashed line in the bottom left panel of Fig. 2a indicates the 90% quantile cut-off of SPLs in the normal samples (TCGA-normal and GTEx samples were analysed separately, and the higher value was used for enhanced stringency). Tumour samples with SPLs higher than the 90% quantile cut-off were considered over-spliced for a given c3USP. Under-splicing was similarly defined as SPLs lower than the 10% quantile of the normal samples.
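As an illustration of the over-splicing call, the sketch below takes the higher of the two 90% quantiles from the normal cohorts as the cut-off. The quantile definition (linear interpolation between order statistics) is an assumption, since the text does not specify one, and the function names are hypothetical.

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics.
    This definition is an assumption; other conventions exist."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def over_spliced_samples(tumour_spls, tcga_normal_spls, gtex_spls):
    """tumour_spls: dict mapping a tumour sample ID to its SPL for one c3USP.

    The two normal cohorts are evaluated separately and the higher
    90% quantile is used as the cut-off, as described in the text.
    """
    cutoff = max(quantile(tcga_normal_spls, 0.9), quantile(gtex_spls, 0.9))
    return [sample for sample, spl in tumour_spls.items() if spl > cutoff]
```

An under-splicing call would follow the same pattern with the 10% quantile and a "lower than" comparison.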

Survival analysis

SPLs of each c3USP in each cancer type were correlated with patient survival using the Cox proportional hazards regression model. The Kaplan–Meier method was applied by splitting patients into high and low groups according to their SPLs (top and bottom halves ranked by SPL). The same approach was applied to investigate the prognostic effect of general splicing in each cancer type using the total number of 3USPs and all splicing events. For the Kaplan–Meier curves of CTNNB1 in liver cancer, tumour samples in which CTNNB1 was mutated, overexpressed or the 3′ UTR of CTNNB1 was over-spliced were compared with the samples in which these phenomena are absent, respectively. Therein, overexpression and over-splicing were defined by transcript expression or SPLs greater than 90% quantile of that in the normal samples as described in the section ‘Analysis of dysregulated 3′ UTR splicing in cancer’. The mutant CTNNB1 samples included only missense mutations while the wild-type samples did not contain any genetic alterations for CTNNB1.
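The grouping and Kaplan–Meier estimation steps can be sketched as follows; Cox regression is not reproduced here. The record layout and function names are hypothetical, and dropping the middle patient when the cohort size is odd is an assumption.

```python
def split_by_spl(patients):
    """patients: list of dicts with keys 'spl', 'time' and 'event'
    (1 = event observed, 0 = censored).

    Returns (low_half, high_half) after ranking by SPL. With an odd
    number of patients the middle one is left out (an assumption).
    """
    ranked = sorted(patients, key=lambda p: p["spl"])
    half = len(ranked) // 2
    return ranked[:half], ranked[len(ranked) - half:]


def kaplan_meier(group):
    """Return [(time, survival_probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    event_times = sorted({p["time"] for p in group if p["event"] == 1})
    for t in event_times:
        at_risk = sum(1 for p in group if p["time"] >= t)
        events = sum(1 for p in group if p["time"] == t and p["event"] == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve
```

Comparing the two resulting curves would then require a separate test such as the log-rank test, which is outside this sketch.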

Processed public datasets

The processed GFF files derived from the PacBio long-read sequencing data from liver, heart, brain and the MCF7 cancer cell line were downloaded from the PacBio IsoSeq Human Tissue and MCF7 datasets (http://datasets.pacb.com.s3.amazonaws.com/2014/). Owing to the shallow sequencing depth, we combined all the identified transcripts from these four cell types. The 3USPs that overlapped with the PacBio-identified isoforms with identical 5′ and 3′ splice sites were considered as being supported by PacBio. The genomic coordinates of repeat elements were downloaded from the UCSC genome browser³⁵ and overlapped with introns in 3′ UTRs using BEDTools (v2.29) (ref. ³⁶). The putative binding sites of RBPs were downloaded from the POSTAR database³⁷, and significant peaks identified from PAR-CLIP, HITS-CLIP, iCLIP and eCLIP were merged. RBPs related to splicing were selected on the basis of merging the annotations from GO terms and KEGG pathways as described in a previous study³⁸. In total, 519 HepG2 RNA-seq samples from ENCODE, including short-hairpin-RNA-mediated knockdown of 227 RBPs and 51 control samples, were analysed. For each RBP, we compared the SPL of CTNNB1 3′ SP between the knockdown and control samples using the Mann–Whitney U test.

Annotations of oncogenes and tumour suppressors were derived by combining the resources from CancerMine³⁹ and OncoKB (Precision Oncology Knowledge Base)⁴⁰. In total, 889 oncogenes and 878 tumour suppressors were obtained. Next, they were overlapped with genes that contained significantly upregulated and downregulated c3USPs (FDR <0.1) and the hypergeometric test was used to measure the significance of the overlap.
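The overlap test can be illustrated with a small upper-tail hypergeometric calculation using only the standard library. The choice of background gene population is an assumption that the text does not specify, and the function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` carry the annotation
    (for example, oncogene)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total
```

Here `successes` would be 889 for oncogenes or 878 for tumour suppressors, `draws` the number of genes containing significantly dysregulated c3USPs, and `observed` the size of their overlap.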

Reagents

Reagents are as follows: antibody reagents (Supplementary Table 8); TRIzol, Lipofectamine 3000, Lipofectamine RNAiMAX, Dulbecco’s modified Eagle medium (DMEM), Roswell Park Memorial Institute 1640 medium (RPMI), Opti-MEM reduced serum medium, foetal bovine serum (FBS), 10× transcription buffer, NTPs (Thermo Fisher); Dharmafect 1, siGENOME and On-targetPLUS siRNA reagents (Dharmacon) (Supplementary Table 8); ASOs, Alt-R S.p. HiFi Cas9 Nuclease V3, single guide RNA, homology-directed repair template (Integrated DNA Technologies) (Supplementary Table 8); pcDNA3.1+ vector (Addgene); psiCHECK-2 vector (Promega).

Plasmids and mutagenesis

The CTNNB1 3′ UTR variants were cloned into psiCHECK-2, and the HA tag, CTNNB1 CDS and 3′ UTR variants were cloned into pcDNA3.1+ using the primers and restriction sites listed in Supplementary Table 9. Restriction sites or linkers between the CDS and 3′ UTRs were removed using the QuikChange Lightning Multi Site-Directed Mutagenesis Kit (Agilent) as per the manufacturer’s protocol. All constructs were verified by Sanger sequencing.

Cell culture, transfection and treatments

Human HCC cell lines Hep3B (ATCC: HB-8064) and HepG2 (ATCC: HB-8065) were cultured in DMEM and SNU398 (ATCC: CRL-2233) in RPMI. The colon adenocarcinoma (COAD) cell line DLD-1 (Horizon Discovery: HD PAR-086) was cultured in RPMI. Both DMEM and RPMI were supplemented with 10% FBS, penicillin–streptomycin and glutamine. Hep3B and DLD-1 cells express wild-type CTNNB1, while HepG2, SNU398 and HCT116 cells express constitutively active CTNNB1 mutants^41,42. The normal liver cell line, THLE-2 (ATCC: CRL-2706), was grown in BEGM Bronchial Epithelial Cell Growth Basal Medium (Lonza) supplemented with 10% FBS, 5 ng ml⁻¹ human epidermal growth factor, 70 ng ml⁻¹ phosphoethanolamine and the additives from the BEGM Bronchial Epithelial Cell Growth Medium BulletKit (except gentamycin/amphotericin and epinephrine). The normal colon cell line CCD 841 CoN and colorectal carcinoma cell line HCT116 were cultured in DMEM as above. The cells were maintained at 37 °C in a humidified atmosphere with 5% CO₂. For knockdown experiments, 150,000 cells were transfected with 50 nM of each siRNA per well in 12-well plates using Dharmafect 1 following the manufacturer’s instructions. For overexpression and ASO experiments, cells were seeded at 120,000 cells per well in 12-well plates 24 h before transfecting 500–1,000 ng of each plasmid with Lipofectamine 3000 or 100 nM of each ASO with Lipofectamine RNAiMAX as per the manufacturer’s protocol. For treatments, 5 μM of actinomycin D (Sigma), 355 nM of cycloheximide (Sigma) or 20 μM of MG132 (Santa Cruz) was added to cells 48 h post-transfection. Post-treatment, the cells were collected at the specified timepoints for downstream analysis.

CRISPR–Cas9 gene editing

For CRISPR–Cas9 gene editing experiments, cells were seeded in 12-well plates 24 h before transfecting 13.6 nM of ribonucleoprotein complexes (consisting of Cas9 and single guide RNA) and 7.8 nM of homology-directed repair template with Lipofectamine CRISPRMAX following the manufacturer’s instructions. Cells were collected 48 h post-transfection and genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen) as per the manufacturer’s protocol for subsequent PCR and Sanger sequencing validation. Soft agar assays and RNA and protein extractions were also performed as described below.

Soft agar assay

Cells were transfected as described above 18–24 h before seeding. On the day of seeding, a 0.6% base agarose was prepared in 12-well plates. Transfected cells were trypsinized, resuspended and counted. A seeding density of 7,000 cells per well was used for Hep3B and SNU398, and 6,000 cells per well for HepG2. The cells were mixed in their respective growth medium and agarose to a final agarose concentration of 0.3% and added to the prepared base. Once the agarose solidified, 0.5 ml of growth medium was added to each well. The cells were maintained at 37 °C in a humidified atmosphere with 5% CO₂ and the growth medium was changed every 2 days. The colonies were imaged after 10–14 days under 4× magnification using the Olympus IX71 microscope and quantified using ImageJ (v1.51j8).

Migration assay

Cell migration assays were performed with Hep3B and SNU398 cells in wound-healing culture-insert dishes (Ibidi) and transwell chambers (Corning). The cells were transfected with siRNAs as described above and collected 48 h post-transfection. For the wound healing assay, 70,000 cells were seeded on each side of the chamber. Twenty-four hours after seeding, the insert was removed and the cells were washed three times with their corresponding medium. At the specified timepoints, imaging was performed and the percentage wound closure was measured using the CellSens software (v1.15). For the transwell experiment, 200,000 cells were seeded on the upper transwell chamber in serum-free medium as described⁴³. The seeded cells were cultured for another 48 h followed by fixation, staining and imaging.

Xenograft

Hep3B cells were transfected in six-well plates. Forty-eight hours post-transfection, cells were collected, washed and counted. Two million cells per injection were prepared by mixing the cell suspension with Matrigel Matrix (Corning) in a 1:1 ratio. The cell mixture was injected subcutaneously into the lower flank on each side of five (per condition) 4- to 6-week-old, female, CrTac:NCr-Foxn1<nu> (NCr nude) mice (Invivos). Tumour sizes were measured every 3 days. The mice were killed after 35 days and the tumours were excised, weighed and measured. All mouse work was performed in accordance with the NUS IACUC guidelines under the protocol number R19-0852. The maximum tumour volume permitted is 2,000 mm³, which was not exceeded in any of our xenograft experiments. The mice were housed in the following conditions: 23–24 °C, 44–58% humidity, 12 h/12 h dark/light cycle (19:00–7:00/7:00–19:00).

Luciferase and translation reporter assays

Cells were seeded at 50,000 cells per well in 24-well plates a day before transfection. Then, 25 ng of psiCHECK-2 plasmids were transfected per well using Lipofectamine 3000 as described above. The transfected cells were washed in PBS and lysed and luminescence was measured 72 h post-transfection following the manufacturer’s protocol for the dual luciferase reporter assay kit (Promega).

For the translation reporter assay, PCR was performed using forward primers with T7 promoter and Kozak sequences, reverse primers with polyT (Supplementary Table 10) and psiCHECK-2 plasmids as templates. In vitro transcription was performed using 1 μg of purified PCR product, 1× transcription buffer, 4 mM NTP mix, 8 mM 3′-O-Me-m7G(5′)ppp(5′)G RNA Cap Structure Analog (New England Biolabs) and 200 U T7 RNA polymerase (Ambion), and incubated for 4 h at 37 °C. The transcription products were purified by ethanol precipitation followed by the Microspin G-50 columns (GE Healthcare). Cells were seeded as described above. Then, 20 ng of the Firefly luciferase control was co-transfected using Lipofectamine 3000 with the corresponding amount of Renilla luciferase control or 3′ UTR reporter (calculated on the basis of 5 × 10¹¹ copies per reporter). Luciferase activity was measured 48–72 h post-transfection as described above.

RNA extraction and RT–qPCR

Total RNA was extracted using TRIzol followed by column purification using the PureLink RNA Mini Kit (Thermo Fisher) with on-column DNase I treatment (Thermo Fisher). The High Capacity cDNA Reverse Transcription Kit (Thermo Fisher) was used to generate complementary DNA. Subsequently, real-time quantitative PCR (RT–qPCR) was performed using the PowerUp SYBR Green Master Mix (Applied Biosystems) on the QuantStudio 5 RT–PCR system (Applied Biosystems). PCR experiments were performed using EconoTaq PLUS GREEN 2× Master Mix (Lucigen) following the manufacturer’s instructions. qPCR and PCR primers are listed in Supplementary Table 10.

Protein extraction and western blot analysis

Cells were collected and lysed as previously described⁴⁴. For western blot analysis, 10–15 μg of lysates were fractionated using 8% SDS–PAGE gels in running buffer (25 mM Tris, 192 mM glycine and 0.1% SDS) and transferred to PVDF membranes (Thermo Fisher) in transfer buffer (25 mM Tris, 192 mM glycine and 20% (v/v) methanol). The membranes were probed with specific primary and secondary antibodies in 5% BSA–TBST.

RNA fluorescence in situ hybridization

RNA–FISH was performed using custom BaseScope probes, RNAscope Pretreatment Reagents, Wash Buffer Reagents and BaseScope Detection Reagents v2-Red (ACD, Supplementary Table 11) following the manufacturer’s protocol. Briefly, 200,000 Hep3B and 300,000 SNU398 cells were seeded on glass coverslips in six-well plates. The cells were grown to 50–70% confluency, fixed in 10% formalin for 30 min and subjected to a series of ethanol dehydration and rehydration steps, hydrogen peroxide and RNAscope Protease III treatment. They were subsequently hybridized with the custom ACD probes, followed by a series of signal amplification steps, signal detection using the BaseScope Fast RED dye and counterstaining with DAPI (Thermo Fisher). The glass coverslips were mounted using ProLong Gold Antifade Mountant (Thermo Fisher). Fluorescence images were acquired at 60× magnification using the Olympus FV1200 confocal microscope and Fluoview (v3.0) and processed using ImageJ (v1.51j8).

RNA immunoprecipitation

The protocol was adapted from the RIP–ChIP protocol described previously⁴⁵. Briefly, protein A Sepharose beads (Sigma) were coated with 3 µg of U1 snRNP, SF3B1, SRSF1, U2AF2 or mouse IgG antibody (Santa Cruz), followed by incubation with 2 mg of Hep3B or SNU398 total cell lysates overnight. The RNA–protein–bead complexes were washed once with NT2 crowders (25 mg Ficoll PM400 (GE Healthcare), 75 mg Ficoll PM70 (GE Healthcare) and 2.5 mg dextran sulfate (Fluka) in 10 ml of NT2 buffer) and five times with NT2 buffer (50 mM Tris pH 7.0, 150 mM NaCl, 1 mM MgCl₂ and 0.05% (v/v) NP-40). Protein–RNA complexes were collected in 100 μl of NET2 buffer (1 mM DTT, 16.7 mM EDTA, 200 U RNaseOUT (Thermo Fisher) and 100 U SUPERase In (Ambion) in 1× NT2 crowder), supplemented with 100 μl of 2× SDS–TE (100 mM Tris pH 7.5, 10 mM EDTA pH 8.0 and 1% SDS). RNA was isolated using TRIzol reagent and subsequently purified with phenol:chloroform:isoamyl alcohol (25:24:1) and chloroform:isoamyl alcohol (24:1).

RNA pulldown

RNA pulldown using biotinylated probes was performed as described previously with slight modifications⁴⁶. In brief, cell lysates were prepared in lysis buffer (25 mM Tris pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% (v/v) NP-40, 5% (v/v) glycerol and 100 U ml⁻¹ SUPERase In protease inhibitor). Then, 50 μl of Dynabeads MyOne Streptavidin C1 (per sample) were pre-washed and blocked in 500 μl of lysis buffer supplemented with 0.2 μg μl⁻¹ yeast tRNA (Thermo Fisher) for 2 h at 4 °C. Meanwhile, 4 μg of biotinylated probe (Supplementary Table 8) was incubated with 1 mg of lysate per reaction for 1 h at room temperature with rotation, after which blocked beads were washed and added to the lysate–probe mix and incubated for another 2 h at room temperature with rotation. The RNA–protein–bead complexes were washed six times with lysis buffer. Proteins were eluted in 1× reducing sample buffer and loaded on 8% SDS–PAGE gels for western blot analysis.

Nuclear–cytoplasmic fractionation

Nuclear–cytoplasmic fractionation was performed following the Abcam protocol with some modifications. Briefly, cell pellets were lysed in Buffer A (10 mM HEPES, 1.5 mM MgCl₂, 10 mM KCl, 0.5 mM DTT and 0.05% (v/v) NP-40, pH 7.5). Following a 10 min incubation on ice and centrifugation, supernatants were collected as the cytoplasmic fractions. The remaining pellets were resuspended in Buffer B (5 mM HEPES, 1.5 mM MgCl₂, 0.2 mM EDTA, 0.5 mM DTT and 26% (v/v) glycerol, pH 7.5) and 750 mM NaCl. The suspension was homogenized on ice using a handheld homogenizer. After a 30 min incubation on ice and centrifugation, the resulting supernatants were collected as the nuclear fractions. The fractions were divided and processed to collect RNA and proteins as described above.

Polysome extraction and fractionation

Polysome extraction, fractionation and RNA extraction were performed as previously described with some modifications⁴⁷. Gradient centrifugation was performed for 1.5 h, followed by polysome fraction collection and RNA extraction. cDNA was prepared using the High Capacity cDNA Reverse Transcription Kit (Thermo Fisher) and the same volume of RNA across all fractions (calculated from 1 μg of RNA based on the highest concentration among all fractions). RT–qPCR was performed as described above.
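One reading of the equal-volume input rule is that the volume is chosen so that the most concentrated fraction contributes 1 μg of RNA, and the same volume is then taken from every fraction. The sketch below follows that interpretation, which is an assumption; the function name and example concentrations are hypothetical.

```python
def equal_volume_input(concentrations_ng_per_ul, target_ng=1000.0):
    """Return the common input volume (in microlitres) and the RNA mass (in ng)
    contributed by each fraction.

    The volume is set so that the fraction with the highest
    concentration supplies `target_ng` (1 microgram by default).
    """
    volume_ul = target_ng / max(concentrations_ng_per_ul)
    masses_ng = [c * volume_ul for c in concentrations_ng_per_ul]
    return volume_ul, masses_ng


# Hypothetical fraction concentrations in ng per microlitre.
volume, masses = equal_volume_input([50.0, 200.0, 100.0])
```

Using one volume for all fractions keeps the relative RNA abundance between fractions, which is the quantity a polysome profile is meant to compare.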

Statistics and reproducibility

All statistical analyses for experimental data were performed using Excel from Microsoft 365 v16 and statistical significance was considered at test level P < 0.05. P values were calculated using unpaired two-tailed Student’s t-test unless otherwise stated. All experiments were performed independently at least twice by two or more investigators with reproducible results. No statistical method was used to pre-determine sample size. Eighty RNA-seq samples from GTEx were excluded because of low sequencing depth (total number of splicing junctions <1,000). Sample numbers before and after exclusion are presented in Supplementary Table 1. No data were excluded from the analyses of biological experiments. Randomization was applied to all in vivo experiments but not in vitro experiments as it was not necessary. Blinding was not applied to computational analyses as these were performed using unbiased software programs or algorithms. Blinding was applied to the data collection of at least one set of each experiment except for RNA–FISH owing to the experimental technicality and licence requirement for confocal microscopy.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data Availability

The PLANet RNA-seq dataset of 211 samples was generated in the PLANet study and deposited in the European Genome-phenome Archive (EGA, http://www.ebi.ac.uk/ega/) under the accession code EGAS00001003813 (ref. ¹⁴). All c3USPs across ten TCGA cancer types and their corresponding normal tissues, as well as the patterns of SPL, can be found on the SpUR database: http://www.cbrc.kaust.edu.sa/spur/home. It also provides a function to query the association between 3′ UTR SPLs and prognosis in each cancer type. RNA-seq data of the in-house four HCC matched pairs and 55 AML and healthy control samples analysed in this study have been deposited in Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra) with the accession number PRJNA602213. RNA-seq data from PacBio used in this study are released on its official website and can be downloaded from the links below: http://datasets.pacb.com.s3.amazonaws.com/2014/Iso-seq_Human_Tissues/list.html; http://datasets.pacb.com.s3.amazonaws.com/2013/IsoSeqHumanMCF7Transcriptome/list.html. The genome and gene annotation used for the alignment were downloaded from GENCODE (https://www.gencodegenes.org). Source data are provided with this paper. All other data supporting the findings of this study are available from the corresponding author on reasonable request.

Code availability

The custom code to extract, filter and analyse 3USPs is publicly available at https://github.com/christear/RNASeq3USP, which requires the use of BEDTools (v2.29), featureCounts (v1.6.1), SAMtools (v1.8), Perl (v5.26) and R (v4.1.2).

References

Mayr, C. What are 3’ UTRs doing? Cold Spring Harb. Perspect. Biol. 11, a034728 (2018).
Kataoka, K. et al. Aberrant PD-L1 expression through 3′-UTR disruption in multiple cancers. Nature 534, 402–406 (2016).
Schuster, S. L. & Hsieh, A. C. The untranslated regions of mRNAs in cancer. Trends Cancer 5, 245–262 (2019).
Mayr, C. & Bartel, D. P. Widespread shortening of 3′ UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009).
Lianoglou, S., Garg, V., Yang, J. L., Leslie, C. S. & Mayr, C. Ubiquitously transcribed genes use alternative polyadenylation to achieve tissue-specific expression. Genes Dev. 27, 2380–2396 (2013).
Park, H. J. et al. 3′ UTR shortening represses tumor-suppressor genes in trans by disrupting ceRNA crosstalk. Nat. Genet. 50, 783–789 (2018).
Berkovits, B. D. & Mayr, C. Alternative 3′ UTRs act as scaffolds to regulate membrane protein localization. Nature 522, 363–367 (2015).
Ma, W. & Mayr, C. A membraneless organelle associated with the endoplasmic reticulum enables 3′ UTR-mediated protein–protein interactions. Cell 175, 1492–1506.e19 (2018).
Lee, S. H. & Mayr, C. Gain of additional BIRC3 protein functions through 3′-UTR-mediated protein complex formation. Mol. Cell. 74, 701–712.e9 (2019).
Obeng, E. A., Stewart, C. & Abdel-Wahab, O. Altered RNA processing in cancer pathogenesis and therapy. Cancer Discov. 9, 1493–1510 (2019).
Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415 (2008).
Malka, Y. et al. Post-transcriptional 3′-UTR cleavage of mRNA transcripts generates thousands of stable uncapped autonomous RNA fragments. Nat. Commun. 8, 2029 (2017).
Ryan, M. et al. TCGASpliceSeq a compendium of alternative mRNA splicing in cancer. Nucleic Acids Res. 44, D1018–1022 (2016).
Zhai, W. et al. Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in hepatocellular carcinoma: the PLANET study. Natl Sci. Rev. 9, nwab192 (2021).
Nusse, R. & Clevers, H. Wnt/β-catenin signaling, disease, and emerging therapeutic modalities. Cell 169, 985–999 (2017).
Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).
Thiele, A., Nagamine, Y., Hauschildt, S. & Clevers, H. AU-rich elements and alternative splicing in the β-catenin 3′UTR can influence the human β-catenin mRNA stability. Exp. Cell. Res. 312, 2367–2378 (2006).
Mount, S. M., Pettersson, I., Hinterberger, M., Karmas, A. & Steitz, J. A. The U1 small nuclear RNA-protein complex selectively binds a 5′ splice site in vitro. Cell 33, 509–518 (1983).
Takemura, R., Takeiwa, T., Taniguchi, I., McCloskey, A. & Ohno, M. Multiple factors in the early splicing complex are involved in the nuclear retention of pre-mRNAs in mammalian cells. Genes Cells 16, 1035–1049 (2011).
Prasanth, K. V. et al. Regulating gene expression through RNA nuclear retention. Cell 123, 249–263 (2005).
Montes, M., Sanford, B. L., Comiskey, D. F. & Chandler, D. S. RNA splicing and disease: animal models to therapies. Trends Genet. 35, 68–87 (2019).
Bicknell, A. A., Cenik, C., Chua, H. N., Roth, F. P. & Moore, M. J. Introns in UTRs: why we should stop ignoring them. Bioessays 34, 1025–1034 (2012).
Basu, S. K. et al. 3′UTR elements inhibit Ras-induced C/EBPβ post-translational activation and senescence in tumour cells. EMBO J. 30, 3714–3728 (2011).
Basu, S. K. et al. A RAS–CaMKKβ–AMPKα2 pathway promotes senescence by licensing post-translational activation of C/EBPβ through a novel 3′UTR mechanism. Oncogene 37, 3528–3548 (2018).
Buratti, E. & Baralle, F. E. Influence of RNA secondary structure on the pre-mRNA splicing process. Mol. Cell. Biol. 24, 10505–10514 (2004).
Kim, D. D. et al. Widespread RNA editing of embedded Alu elements in the human transcriptome. Genome Res. 14, 1719–1725 (2004).
Tang, S. J. et al. Cis- and trans-regulations of pre-mRNA splicing by RNA editing enzymes influence cancer development. Nat. Commun. 11, 799 (2020).
Quemener, A. M. et al. The powerful world of antisense oligonucleotides: from bench to bedside. Wiley Interdiscip. Rev. RNA 11, e1594 (2020).
Crooke, S. T., Witztum, J. L., Bennett, C. F. & Baker, B. F. RNA-targeted therapeutics. Cell Metab. 27, 714–739 (2018).
Sternburg, E. L. & Karginov, F. V. Global approaches in studying RNA-binding protein interaction networks. Trends Biochem. Sci. 45, 593–603 (2020).
Denichenko, P. et al. Specific inhibition of splicing factor activity by decoy RNA oligonucleotides. Nat. Commun. 10, 1590 (2019).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
Chen, T. et al. A missense mutation in SNRPE linked to non-syndromal microcephaly interferes with U snRNP assembly and pre-mRNA splicing. PLoS Genet. 15, e1008460 (2019).
Karolchik, D. et al. The UCSC genome browser database. Nucleic Acids Res. 31, 51–54 (2003).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Hu, B., Yang, Y. T., Huang, Y., Zhu, Y. & Lu, Z. J. POSTAR: a platform for exploring post-transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Res. 45, D104–D114 (2017).
Zhang, B. et al. A comprehensive expression landscape of RNA-binding proteins (RBPs) across 16 human cancer types. RNA Biol. 17, 211–226 (2020).
Lever, J., Zhao, E. Y., Grewal, J., Jones, M. R. & Jones, S. J. M. CancerMine: a literature-mined resource for drivers, oncogenes and tumor suppressors in cancer. Nat. Methods 16, 505–507 (2019).
Chakravarty, D. et al. OncoKB: a precision oncology knowledge base. JCO Precis. Oncol. https://doi.org/10.1200/PO.17.00011 (2017).
Cagatay, T. & Ozturk, M. P53 mutation as a source of aberrant β-catenin accumulation in cancer cells. Oncogene 21, 7971–7980 (2002).
Morin, P. J. et al. Activation of β-catenin-Tcf signaling in colon cancer by mutations in β-catenin or APC. Science 275, 1787–1790 (1997).
Kwok, Z. H., Roche, V., Chew, X. H., Fadieieva, A. & Tay, Y. A non-canonical tumor suppressive role for the long non-coding RNA MALAT1 in colon and breast cancers. Int. J. Cancer 143, 668–678 (2018).
Chan, J. J. et al. A FTH1 gene:pseudogene:microRNA network regulates tumorigenesis in prostate cancer. Nucleic Acids Res. 46, 1998–2011 (2018).
Yoon, J. H., Srikantan, S. & Gorospe, M. MS2-TRAP (MS2-tagged RNA affinity purification): tagging RNA to identify associated miRNAs. Methods 58, 81–87 (2012).
Chen, S. et al. Widespread and functional RNA circularization in localized prostate cancer. Cell 176, 831–843.e22 (2019).
Rahim, A. B. & Vardy, L. A. Analysis of mRNA translation rate in mouse embryonic stem cells. Methods Mol. Biol. 1341, 143–155 (2016).


Acknowledgements

We thank all past and present Y.T. lab members for their constructive feedback on this project, V. Teh for providing the mouse RNA samples, the NUHS Leukemia Cell Bank for the AML samples and S. J. Tang for help with ASO design. The computational analysis in this study is supported by National Supercomputing Centre Singapore (NSCC). Y.T. is funded by NMRC OF-IRGs (NMRC/OFIRG/MOH-000380, MOH-000923), the National Research Foundation Singapore and the Singapore Ministry of Education under its Research Centres of Excellence initiative, and the RNA Biology Center at the Cancer Science Institute of Singapore, NUS, as part of funding under the Singapore Ministry of Education’s AcRF Tier 3 grants (MOE2014-T3-1-006). P.K.H.C. is funded by Singapore National Medical Research Council grants (TCR/015-NCC/2016 and NMRC/CSA-SI/0018/2017). X.G. is funded by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) (BAS/1/1624-01, FCC/1/1976-23-01, FCC/1/1976-26-01, REI/1/0018-01-01, REI/1/4216-01-01, REI/1/4437-01-01, REI/1/4473-01-01 and URF/1/4098-01-01).

Author information

Zhi Hao Kwok
Present address: Division of Pulmonary and Critical Care Medicine, Department of Medicine, Boston University, Boston, MA, USA
Angela Siemens
Present address: The University of British Columbia, Vancouver, British Columbia, Canada
Avencia Sanchez-Mejias
Present address: Integra Therapeutics S.L., Barcelona, Spain
These authors contributed equally: Jia Jia Chan, Bin Zhang.

Authors and Affiliations

Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
Jia Jia Chan, Bin Zhang, Xiao Hong Chew, Zhi Hao Kwok, Chun You Lim, Ng Desi, Angela Siemens, Tyas Kinanti, Shane Ong, Avencia Sanchez-Mejias, Omer An, Xiaonan Fan, Yufen Goh, Wee Nih Fam, Melissa J. Fullwood, Wee Joo Chng, Jason J. Pitt, Leilei Chen, Henry Yang & Yvonne Tay
Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Bin Zhang, Adil Salhi & Xin Gao
KAUST Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Bin Zhang, Adil Salhi & Xin Gao
A*STAR Skin Research Labs and Skin Research Institute of Singapore, A*STAR, Immunos, Singapore, Singapore
Nagavidya Subramaniam & Leah A. Vardy
School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
Phuong Thao Ly, Melissa J. Fullwood & Xavier Roca
Department of Haematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, Singapore
Raghav Sundar, Melissa G. Ooi & Wee Joo Chng
Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
Raghav Sundar
Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Raghav Sundar
The N.1 Institute for Health, National University of Singapore, Singapore, Singapore
Raghav Sundar
Department of Pathology, National University Health System, Singapore, Singapore
Shi Wang
Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Bei En Siew, Kuok Chung Lee, Choon Seng Chong, Bettina Lieske, Wai-Kit Cheong, Shridhar Ganpathi Iyer, Glenn Kunnath Bonney & Ker-Kan Tan
Division of Colorectal Surgery, University Surgical Cluster, National University Health System, Singapore, Singapore
Kuok Chung Lee, Choon Seng Chong, Bettina Lieske, Wai-Kit Cheong & Ker-Kan Tan
Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Melissa G. Ooi, Wee Joo Chng & Yock Young Dan
Department of Orthopaedic Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Bryan T. H. Koh & Wilson Wang
Division of Hepatobiliary & Pancreatic Surgery, University Surgical Cluster, National University Health System, Singapore, Singapore
Shridhar Ganpathi Iyer & Glenn Kunnath Bonney
Program in Clinical and Translational Liver Cancer Research, National Cancer Center Singapore, Singapore, Singapore
Wen Huan Ling, Weiwei Zhai & Pierce K. H. Chow
Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Jianbin Chen
Department of Surgery, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
Boon-Koon Yoong
Hepato-Pancreato-Biliary Surgery Unit, Department of Surgery, National Cancer Institute, Bangkok, Thailand
Rawisak Chanwat
Department of Hepatopancreatobiliary and Transplantation Surgery, Singapore General Hospital and National Cancer Center Singapore, Singapore, Singapore
Brian K. P. Goh & Pierce K. H. Chow
Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Weiwei Zhai
Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Melissa J. Fullwood
NUS Centre for Cancer Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Jason J. Pitt & Yvonne Tay
Center for Therapeutics Discovery, Department of Oncological Sciences and Pharmacological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
Ernesto Guccione
Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Leilei Chen
BioMap, Beijing, China
Xin Gao
Academic Clinical Programme for Surgery, SingHealth Duke-NUS Academic Medical Centre (AMC), Singapore, Singapore
Pierce K. H. Chow
Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Yvonne Tay

Contributions

J.J.C. performed experiments and analysed data. B.Z. performed all computational analyses. X.H.C., Z.H.K., C.Y.L., N.D., A. Siemens, T.K. and S.O. performed experiments. O.A. and H.Y. provided the TCGA data. A. Salhi and X.G. built the SpUR database. N.S. performed polysome fractionation experiments. A.S.-M., P.T.L., R.S., X.F., J.J.P., X.R., E.G., L.A.V., L.C. and H.Y. provided input for the project. S.W., B.E.S., K.C.L., C.S.C., B.L., W.-K.C. and K.-K.T. provided COAD clinical samples; Y.G., W.N.F., M.G.O., B.T.H.K., M.J.F., W.W. and W.J.C. provided the clinical AML samples and RNA-seq dataset; S.G.I. and Y.Y.D. provided HCC clinical samples for the in-house RNA-seq dataset; W.H.L., J.C., B.-K.Y., R.C., G.K.B., B.K.P.G., W.Z. and P.K.H.C. provided the HCC clinical samples and RNA-seq dataset from the PLANet study. J.J.C., B.Z. and Y.T. designed the study and prepared the manuscript. All authors reviewed and commented on the manuscript.

Corresponding author

Correspondence to Yvonne Tay.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Cell Biology thanks George Calin and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Global analysis of 3’UTR splicing events with TCGA and GTEx datasets.

a, Heatmap illustrating the percentage overlap of c3USPs identified from TCGA-normal and GTEx samples. The rows and columns indicate the TCGA-normal and GTEx tissues, respectively. b, Boxplot showing the distribution of the percentage of 3’UTR splicing events identified in TCGA-normal that overlapped with those from GTEx, including 3USPs, c3USPs and their corresponding backgrounds. c-d, Bar plots showing the number of 3USPs and c3USPs detected in the TCGA-normal samples of different numbers of cancer types (c) and different tissues from GTEx (d). The x-axis indicates the number of cancer types (Supplementary Table 1). e, Proportion of novel c3USPs that are not annotated or reported in previous studies. f-j, Distribution of the distance from the stop codon for multiple groups of c3USPs (f-h) and 3USPs (i,j) classified by splicing levels in TCGA-tumor (f), TCGA-normal (g,i) and GTEx (h,j). k-l, Cumulative distribution of splicing levels of c3USPs in TCGA-normal (k) and GTEx (l) with or without Alu elements. P-values: Wilcoxon test.

Source data

Extended Data Fig. 2 3’UTR splicing is upregulated in cancer and correlated with poor prognosis.

a, Workflow of the analysis for each c3USP in each cancer type. b, Bar plot showing the number of genes with up and downregulated c3USPs. c, Bar plot illustrating the proportion of oncogenes and tumor suppressors in two groups of genes with either upregulated or downregulated c3USPs. P-values: hypergeometric test. d, Heatmap illustrating p-values derived from overlapping c3USPs between each two cancer types (hypergeometric test). P-values >0.001 are labeled. e, Bar plot showing the number of favorable and unfavorable prognostic c3USPs across different cancer types derived from the Kaplan-Meier method with a p < 0.05 cutoff. f, Bar plot illustrating the proportion of favorable and unfavorable markers in upregulated and downregulated c3USPs. g-i, Sashimi plots illustrating 3’UTR splicing of the top dysregulated splicing event of CTNNB1 in the normal and tumor samples from AML (g), TCGA-BRCA (h) and -LIHC (i). j, Number of 3’UTR splicing events in all (left), and 50 matched tumor-normal samples (right) from the TCGA-LIHC dataset. n: number of RNA-seq samples analyzed; P-values: Mann-Whitney U test. k, Kaplan-Meier survival analysis of TCGA-LIHC samples based on the segmentation numbers of total splicing events (top and bottom half of samples ranked by numbers). n: number of patients analyzed.
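The hypergeometric overlap test named in panels c and d can be illustrated with a minimal sketch. It assumes the usual one-sided formulation: given a universe of events, the probability of seeing at least the observed overlap between two sets by chance. The function name, the choice of universe and all numbers are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_overlap_pvalue(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """One-sided P(X >= overlap) for the overlap of two sets.

    Models drawing `set_b` items without replacement from `universe` items,
    of which `set_a` are 'marked'; X counts the marked items drawn.
    """
    if not (0 <= overlap <= min(set_a, set_b) <= max(set_a, set_b) <= universe):
        raise ValueError("inconsistent set sizes")
    total = comb(universe, set_b)  # all possible draws of size set_b
    upper = min(set_a, set_b)      # largest possible overlap
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 1,000 events in total, two cancer types with 120 and
# 150 dysregulated events, 40 of them shared.
p = hypergeom_overlap_pvalue(1000, 120, 150, 40)
print(f"overlap p-value: {p:.3g}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for large universes a library routine working in log space would be the more practical choice.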

Extended Data Fig. 3 Targeted inhibition of 3’UTR splicing impedes HCC carcinogenesis.

a, Overlap of 3USPs identified from in-house and TCGA-LIHC data. b, Scatterplot illustrating the dysregulation of c3USPs from the PLANet and TCGA-LIHC datasets. c-f, Effect of ASO-mediated blocking of the 3’UTR splice site on candidate 3’SP transcript expression by qPCR (n = 3 independent experiments) (c), candidate 3’FL, 3’SP and CDS transcript expression by PCR (◄ 3’SP2, see Supplementary Note) (d), protein expression (e) and anchorage-independent growth (n = 3 independent experiments) (f) in HepG2. CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR; ASO-NC: non-targeting control ASO; ASO-SS: splice site ASO. c,f, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. d,e, Data shown represent three independent experiments.

Source data

Extended Data Fig. 4 Targeted inhibition of 3’UTR splicing impedes HCC cell proliferation.

a-e, Effect of ASO-mediated blocking of the 3’UTR splice site on the protein expression of cell cycle markers (a), candidate transcript expression by qPCR (n = 3 independent experiments) (b), and PCR (◄ 3’SP2, see Supplementary Note) (c), candidate protein expression (d) and anchorage-independent growth (n = 3 independent experiments) (e) in Hep3B and HepG2. CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR; ASO-NC: non-targeting control ASO; ASO-SS: splice site ASO. b,e, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. a,c,d, Data shown represent three independent experiments.

Source data

Extended Data Fig. 5 CTNNB1 3’SP2 does not affect CTNNB1 expression and HCC tumorigenesis.

a-c, Effect of siRNA-mediated knockdown of CTNNB1 3’SP2 on CTNNB1 transcript (n = 2 independent experiments) (a) and protein (b) expression, and anchorage-independent growth (n = 3 independent experiments) (c) in Hep3B and SNU398. d, Luciferase activity of reporter constructs with CTNNB1 3’FL, 3’SP and 3’SP2 (n = 3 independent experiments). e, PCR analysis of 3’FL and 3’SP expression in mouse adjacent normal and liver tumor samples (left). Alignment of the 3’UTR splice junctions and flanking regions of human CTNNB1 (hsCTNNB1 NCBI RefSeq) and the mouse CTNNB1 3’UTR splice variant (spliced-PCR, detected via Sanger sequencing of the PCR product) (right). CDS and 3’UTR in upper- and lowercase, human 5’ and 3’ exons in black and green, and mouse in blue and red, respectively. siNC: siRNA non-targeting control; CDS: coding sequence; 3’FL: full length; 3’SP: spliced 3’UTR. c,d, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. b,e, Data shown represent three independent experiments.

Source data

Extended Data Fig. 6 3’UTR splicing of CTNNB1 promotes HCC tumorigenesis.

a, Gene set enrichment analysis (GSEA) showing the enrichment of upregulated genes from the WNT signaling and mitotic cell cycle gene sets in the PLANet tumor samples with CTNNB1 3’UTR over-splicing. b-g, Effect of siRNA-mediated knockdown of CTNNB1 CDS, 3’FL and 3’SP on CTNNB1 transcript (n = 3 independent experiments) (b) and protein expression (c) in SNU398, anchorage-independent growth in Hep3B and SNU398 (n = 3 independent experiments) (d), cell migration (n = 2 independent experiments) (e), WNT target transcript (n = 3 independent experiments) (f) and protein expression (g) in SNU398. h, Chromatograms depicting Sanger sequencing validation of the negative control and CRISPR-Cas9-mediated T > G mutation (highlighted) of the CTNNB1 3’UTR splice site (CRISPR-SS mutant) at the genomic level. The red arrow indicates the G > C mutation introduced to the PAM sequence to prevent further Cas9 cleavage. i, Effect of the CRISPR-SS mutation on CTNNB1 transcript expression (n = 3 independent experiments). siNC: siRNA non-targeting control; CDS: coding sequence; 3’FL: full length; 3’SP: spliced 3’UTR. b,d,f,i, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. c,g, Data shown represent three independent experiments.

Source data

Extended Data Fig. 7 3’UTR splicing of CTNNB1 promotes HCC tumorigenesis.

a-d, Effect of siRNA-mediated knockdown of CTNNB1 CDS, 3’FL and 3’SP on CTNNB1 transcript expression by qPCR (n = 3 independent experiments) (a) and PCR (b), protein expression (c) and anchorage-independent growth (n = 3 independent experiments) (d). siNC: siRNA non-targeting control; CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR. a,d, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. b,c, Data shown represent three independent experiments.

Source data

Extended Data Fig. 8 3’UTR splicing of CTNNB1 promotes COAD tumorigenesis.

a, Proportion of 3’UTR splicing events in all (left), and 41 matched tumor-normal samples (right) from the TCGA-COAD dataset. P-values: Mann-Whitney U test; n: number of RNA-seq samples analyzed. b,c, Kaplan-Meier survival analysis of TCGA-COAD samples based on the segmentation numbers of 3USPs (b) and total splicing events (c) (top and bottom half of samples ranked by numbers). n: number of patients analyzed. d, Proportion of CTNNB1 3’UTR splicing events in all normal and tumor samples from the TCGA-COAD dataset. n: number of RNA-seq samples analyzed. e, Comparison of the CTNNB1 3’SP transcript expression between paired normal and COAD clinical samples (n = 12 patient samples). f-i, Effect of ASO-mediated blocking of the 3’UTR splice site on CTNNB1 transcript expression by qPCR (n = 3 independent experiments) (f), and PCR (g), protein expression (h) and anchorage-independent growth (n = 3 independent experiments) (i) in DLD-1. j-m, Effect of siRNA-mediated knockdown of CTNNB1 on CTNNB1 transcript (n = 3 independent experiments) (j), CTNNB1 and WNT target proteins (k) and WNT target transcript (n = 3 independent experiments) (l) expression, and anchorage-independent growth (n = 3 independent experiments) (m) in DLD-1. ASO-NC: non-targeting control ASO; ASO-SS: splice site ASO; CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR. e,f,i,j,l,m, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. g,h,k, Data shown represent three independent experiments.

Source data

Extended Data Fig. 9 3’UTR splicing may enhance CTNNB1 protein expression in HCC.

a, Number of c3USPs that are significantly up/downregulated upon the knockdown of RBPs compared to controls. b-e, Effect of the knockdown of splicing factors on the individual RBP transcript expression (n = 3 independent experiments) (b) and CDS exon-exon junctions (n = 3 independent experiments) (c) in Hep3B and SNU398, CTNNB1 transcript (n = 3 independent experiments) (d) and CTNNB1 protein (data shown represent three independent experiments) (e) expression in SNU398. f, Schematic representation of the CTNNB1 CDS, CDS + 3’FL and CDS + 3’SP overexpression constructs. g, Effect of overexpressing CTNNB1 CDS, CDS + 3’FL and CDS + 3’SP on CTNNB1 transcript expression in Hep3B and SNU398 (n = 3 independent experiments). siNC: siRNA non-targeting control; CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR. b-d,g, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001.

Source data

Extended Data Fig. 10 3’UTR splicing-mediated cytoplasmic localization enhances CTNNB1 expression.

a,b, Effect of overexpressing CTNNB1 CDS, CDS + 3’FL and CDS + 3’SP on endogenous CTNNB1 transcript (n = 3 independent experiments) (a) and exogenous CTNNB1 protein (b) expression in DLD-1. c, Luciferase activity of reporter constructs with CTNNB1 3’FL and 3’SP in DLD-1 (n = 3 independent experiments). d, Effect of actinomycin D (ActD) treatment on the transcript levels of exogenously expressed CTNNB1 CDS, CDS + 3’FL and CDS + 3’SP in Hep3B and SNU398 (n = 3 independent experiments). e,f, Effect of cycloheximide (CHX) (e) or MG132 (f) treatment on exogenously expressed CTNNB1 protein levels in Hep3B and SNU398. g, Subcellular distribution of CHEK1 3’FL and 3’SP transcripts following nuclear-cytoplasmic fractionation of Hep3B and SNU398 cells (n = 3 independent experiments). MALAT1 was used as a nuclear control. The 3’FL:3’SP transcript ratios in each cellular compartment are shown in the table below. h, RNA-FISH showing transcript localization of CHEK1 3’FL and 3’SP in SNU398. EV: empty vector; CDS: coding sequence; 3’FL: full length 3’UTR; 3’SP: spliced 3’UTR. a,c,d,g, Mean ± SEM; unpaired Student’s t-test *p < 0.05, **p < 0.01, ***p < 0.001. b,e,f,h, Data shown represent three independent experiments.

Source data

Supplementary information

Supplementary Information

Supplementary Notes.

Reporting Summary

Peer Review File

Supplementary Table

Supplementary Table 1. General statistics of the 3USPs and c3USPs identified from the TCGA and GTEx datasets.
Supplementary Table 2. Significantly dysregulated c3USPs across ten cancer types. Related to Fig. 1g.
Supplementary Table 3. Prognosis-associated c3USPs across ten cancer types. Related to Fig. 1i.
Supplementary Table 4. Number of tumour samples with over-spliced c3USPs across ten cancer types. Related to Fig. 2b.
Supplementary Table 5. General sequencing statistics of the four paired in-house HCC samples.
Supplementary Table 6. BLAST results of the CTNNB1 3′ FL and 3′ SP-targeting siRNA sequence. Related to Fig. 5.
Supplementary Table 7. Putative RBP binding sites at the terminal exon of CTNNB1. Related to Fig. 6.
Supplementary Table 8. Reagents (antibodies, siRNAs, ASOs and CRISPR–Cas9 system). Related to Methods section ‘Reagents’.
Supplementary Table 9. PCR primers used for the cloning of psiCHECK-2 and pcDNA3.1 plasmids. Related to Methods section ‘Plasmids and mutagenesis’.
Supplementary Table 10. Primers used for PCR and qPCR experiments. Related to Methods section ‘RNA extraction and RT–qPCR’.
Supplementary Table 11. RNA–FISH reagents (ACD). Related to Methods section ‘RNA fluorescence in situ hybridization’.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.