Review Free access | 10.1172/JCI151627
Authors: Choi, P.; Thomas-Tikhonenko, A.
Division of Cancer Pathobiology, Children’s Hospital of Philadelphia, and Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Address correspondence to: Andrei Thomas-Tikhonenko, Children’s Hospital of Philadelphia, 4056 Colket Translational Research Bldg., 3501 Civic Center Blvd., Philadelphia, PA 19104, USA. Email: andreit@pennmedicine.upenn.edu.
Published September 15, 2021
Herculean efforts by the Wellcome Sanger Institute, the National Cancer Institute, and the National Human Genome Research Institute to sequence thousands of tumors representing all major cancer types have yielded more than 700 genes that contribute to neoplastic growth when mutated, amplified, or deleted. While some of these genes (now included in the COSMIC Cancer Gene Census) encode proteins previously identified in hypothesis-driven experiments (oncogenic transcription factors, protein kinases, etc.), additional classes of cancer drivers have emerged, perhaps none more surprising than RNA-binding proteins (RBPs). Over 40 RBPs responsible for virtually all aspects of RNA metabolism, from synthesis to degradation, are recurrently mutated in cancer, and just over a dozen are considered major cancer drivers. This Review investigates whether and how their RNA-binding activities pertain to their oncogenic functions. Focusing on several well-characterized steps in RNA metabolism, we demonstrate that for virtually all cancer-driving RBPs, RNA processing activities are either abolished (the loss-of-function phenotype) or carried out with low fidelity (the LoFi phenotype). Conceptually, this suggests that in normal cells, RBPs act as gatekeepers maintaining proper RNA metabolism and the “balanced” proteome. From the practical standpoint, at least some LoFi phenotypes create therapeutic vulnerabilities, which are beginning to be exploited in the clinic.
The advent of various high-throughput genome analysis techniques culminated in the early 2010s in massively parallel sequencing of common human cancers, to confirm suspected and identify new cancer-driving events by virtue of their recurrent dysregulation across multiple tumors and histotypes. The current version of the Catalogue of Somatic Mutations in Cancer (COSMIC; ref. 1) contains 723 genes. According to the UniProt database (2), their top ten molecular functions include the “usual suspects”: cell surface receptors, protein kinases, transcription factors, etc. However, we were intrigued by the strong representation of RNA-binding proteins (RBPs), which constitute 5% of all mapped COSMIC genes (Figure 1) — 65 total. We further limited these 65 entries to 42 that were listed as “reviewed” in COSMIC and to 14 further classified as drivers by the cBioPortal for Cancer Genomics (3–5) (Figure 2 and Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/JCI151627DS1).
Figure 1. RNA-binding proteins as cancer drivers. Distribution of the key molecular functions of the 723 genes in the COSMIC Cancer Gene Census (v92, released August 27, 2020). Molecular functions were obtained from the UniProt database.
Figure 2. Roles of COSMIC genes in RNA metabolism pathways. Listed in boxes are COSMIC Cancer Gene Census genes encoding proteins with RNA-binding activities and classified as tier 1 (“documented activity relevant to cancer”) or tier 2 (“less extensive available evidence”). RBPs further classified as drivers by cBioPortal are highlighted in red text. Although many RBPs function in multiple processes, each RBP was assigned to one primary step in RNA metabolism: transcription, splicing, microRNA biogenesis, nuclear export, folding/turnover, and translation. During transcription, the exact RNA copy of a protein-coding gene is synthesized by RNA polymerase II. It typically contains exons and introns; the latter are removed during splicing, yielding the mature messenger RNA (mRNA). Some introns (as well as occasional exons) contain short stem-loop structures that are recognized and excised by the Microprocessor complex during early stages of microRNA biogenesis. Both mRNAs and microRNAs are moved to the cytosol via nuclear export. Once in the cytosol, mRNAs undergo translation into proteins by the ribosomes; this process is tightly regulated by various RBPs and also by microRNAs, which bind to complementary sequences, typically in 3′-UTRs of mRNAs, and affect both mRNA stability and recognition of the 5′ cap structures by ribosomes.
Collectively, they pertain to virtually all aspects of RNA metabolism, from synthesis to degradation. Thus, in the following pages, we will focus on this highly curated list to ask whether and how their RNA-binding activities pertain to their oncogenic functions. For a much more comprehensive survey of RBPs, we refer the readers to several excellent review articles (6–9). We also acknowledge that many mutations currently classified as “passenger” might still play causal, if more modest, roles in cancer (10). Lastly, many RBPs involved in cancer might be regulated exclusively at the level of expression (by chromatin modifications, noncoding RNAs, etc.) (11, 12) and thus would not appear in the COSMIC database. Still, the 14 genes selected for detailed analyses provide a useful representation of how RBPs might be functioning in the context of neoplastic transformation. Our key conclusion is that for most cancer-driving RBPs, RNA binding is either abolished (the classical loss-of-function phenotype) or carried out imprecisely, which can be described as the “low-fidelity” (LoFi) phenotype. One important feature of these LoFi phenotypes is that they affect many molecular targets indiscriminately, often resulting in “death by a thousand cuts,” as opposed to “death by the smoking gun.”
While the involvement of transcription factors (TFs) in cancer is well documented, it is less common knowledge that some TFs also exhibit RNA-binding activities. Four TF-encoding genes appear on our master list (Table 1 and Supplemental Figure 2A). Because of their pronounced DNA-binding properties, one might ask how essential the RNA-binding activity is for neoplastic transformation. The short answer appears to be “not very,” and the surprising overall trend appears to be the loss or dysregulation of RNA binding during the course of neoplastic transformation.
SMARCA4. The transcription activator BRG1, one of the two alternative ATP-dependent catalytic subunits of the SWI/SNF chromatin remodeling complex (the other being SMARCA2/BRM), is encoded by SMARCA4 (13). In cancer, SMARCA4 is most frequently affected by deep (biallelic) deletions and truncating frameshift mutations, which generally result in loss of protein and its function, arguing that SMARCA4 is a tumor suppressor (TS) gene. Moreover, several known hotspot missense mutations (including the most common, T910/M/A/R) map to the SNF2 and helicase domains essential for catalytic activity (14) and are thought to inactivate the enzymatic function of BRG1 (Supplemental Figure 2A).
It is possible that these missense mutations affect the RNA-binding activity of BRG1, but this hypothesis would be hard to test, since there is no recognizable sequence-specific RNA-binding domain (RBD) and in fact SMARCA4/BRG1 is only known to bind to one RNA species: the Xist noncoding RNA (15). The prevalent model is that Xist expels BRG1 from the inactive X chromosome and in doing so antagonizes the SWI/SNF complex (16). If this model is correct, the RNA-binding activity of SMARCA4 might be an impediment to its function, rather than something it actively relies on.
SPEN. The split ends protein (also known as SHARP), a prototype member of the family of transcriptional repressors with the characteristic SPOC domain, is encoded by SPEN (17). Like SMARCA4, SPEN frequently accumulates truncating frameshift mutations, suggesting a loss-of-function mechanism of dysregulation in various cancers. Unlike SMARCA4, this presumed TS does not accumulate deep deletions or identifiable hotspot missense mutations (Supplemental Figure 2A).
Interestingly, murine Spen is also an Xist-binding protein (18–20), but unlike BRG1, SPEN has four identifiable RNA recognition motifs (RRMs) at the N-terminus, suggesting that RNA binding is central to its functions. One of these is silencing of endogenous retroviruses (ERVs) by recruitment of chromatin remodelers to ERV loci (21). However, frameshift mutations seem to be randomly distributed along the length of the gene, suggesting that preservation, let alone enhancement, of RNA-binding activity is not driving cancer phenotypes.
WT1. The gene WT1 encodes Wilms tumor protein 1, a TF with well-recognized DNA-binding features such as Cys2His2 zinc fingers (ZFs). While it plays an important role in development (22), the underlying molecular mechanisms are quite complex. It was recognized early on that WT1 might be more than a TF (23), owing to the existence of distinct isoforms arising from alternative splicing (e.g., 17-codon insertion in exon 5) and additional modifications such as sumoylation (24). In the context of this discussion, the most relevant dichotomy is between the canonical and the so-called +KTS isoforms, with the latter showing an insertion of Lys-Thr-Ser next to ZF3 (25). This event is thought to alter the critical spacing between ZFs 3 and 4, which could abrogate DNA-binding properties of WT1 and redirect it toward becoming an RBP (26). Additionally, the +KTS isoform was shown to interact with several RBPs, such as RBM4, and localize to nuclear speckles, indicative of a potential role in splicing (27).
How are these properties relevant to cancer? Like SMARCA4 and SPEN, WT1 is most frequently affected by disabling splice site and truncating frameshift mutations, with “warmspots” at amino acids 369 and 381 found in acute myelogenous leukemia (AML) and several types of solid cancers (Supplemental Figure 2A). Based on this clustering, one could argue that the tumor-suppressive properties of WT1 map to its C-terminal KTS insertion, making RNA binding by WT1 irrelevant in cancer cells while potentially relevant for its tumor-suppressive activity in normal cells. Consistent with this idea, an early study reported a relative increase in the –KTS isoform in breast cancer compared with normal tissues (28).
In contrast, in desmoplastic small round cell tumor, there is a well-recognized translocation involving WT1 and EWSR1 (29), which preserves the KTS alternative splice sites (26). Thus, while the –KTS isoform could serve as a TF (for example, for the PDGFA gene; ref. 30), the +KTS isoform could still contribute to RNA-centric processes such as splicing, as is known to be the case with EWSR1.
EWSR1. The gene EWSR1 encodes EWS, a nuclear protein, which is typically grouped with FUS/TLS and TAF15 into the FET (formerly known as TET) family of gene expression regulators. Interestingly, while they interact with components of the transcriptional machinery and possess well-defined N-terminal activation domains (31), they lack classical DNA-binding domains and contain instead conserved RBDs (32, 33), as revealed by the photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) technique (34). Each of the FET proteins, including EWS, was bound to thousands of transcripts, with a large fraction of cross-linked clusters mapping to intronic regions. In addition, EWS binds several known splicing factors, suggesting that it could function to couple transcription and splicing (35). Additional efforts to profile the EWS-RNA interactome revealed its role in the processing of many primary microRNA (pri-miRNA) transcripts (36), expanding its reach to noncoding RNAs (37).
Regardless of the normal function of EWS, the EWSR1 gene is profoundly altered in several cancers. While it does not accumulate somatic mutations or copy number alterations, it is fused with select TF genes: the above-mentioned WT1 in small round cell sarcoma (reviewed in ref. 26) and ERG or, more frequently, FLI1 in Ewing’s sarcoma (38) and various other soft tissue tumors (39). In the process of being translocated, EWS inevitably loses its RBD and replaces it with a DNA-binding domain from its partner, reconstituting an active TF (Supplemental Figure 2A). These hybrid TFs could act by a variety of mechanisms, ranging from derepression of E2F target promoters (40) to causing R-loop formation in the chromatin, which interferes with DNA repair (41). Overall, this DNA binding accounts for most, if not all, transforming activity of EWS-FLI1 (42).
What about its RNA-binding activity? With the RBD gone, EWS-FLI1 can still interact with small nuclear ribonucleoproteins (43) and some splicing factors, but it also loses the ability to bind to others, including serine/arginine (SR) proteins (44) and YB-1 (45), which play important roles in proper exon assembly (see below). In addition, a report showed novel RNA-binding properties of EWS-FLI1 (46), and a follow-up paper demonstrated the effect of EWS-FLI1 on alternative splicing, via binding both to RNA and to several RBPs, including the COSMIC gene–encoded DDX5 (47). Several alternatively spliced transcripts have putative oncogenic functions, among them the recently identified noncanonical ARID1A-L isoform (48). In an interesting twist, the EWS-FLI1 transcript itself is subject to alternative splicing, which often disrupts its open reading frame and could be deemed a therapeutic vulnerability (49).
In summary, of the four proteins profiled in this section, only EWS-FLI1 possesses well-documented RNA-binding activity in cancer cells. Several papers support the notion that among genes dysregulated by EWS-FLI1 at the level of splicing are putative oncogenes. However, the most parsimonious explanation is that Ewing’s sarcoma pathogenesis is driven not so much by individual aberrantly spliced oncogenes, but rather by LoFi splicing affecting multiple TSs. This conceptual dichotomy, “death by the smoking gun” versus “death by a thousand cuts,” is highlighted in the following section, concerned with splicing factors.
The discovery of highly recurrent mutations in splicing factors (SFs) across a variety of tumor types has provided compelling genetic evidence for a direct causal relationship between splicing dysfunction and cancer. Here we focus on five mutated SFs with the strongest evidence for being driver events (Table 1 and Supplemental Figure 2B) and discuss their potential roles in promoting tumorigenesis. SF gene mutations in cancer are the subject of several excellent reviews (12, 50), and a recent pan-cancer genomic survey has nominated some additional genes awaiting further study (51).
U2AF1. The gene U2AF1 encodes the smaller subunit of the U2 auxiliary factor (U2AF) heterodimer that is involved in recognition of the 3′ splice site (52–54). Alterations in U2AF1 consist predominantly of heterozygous missense mutations at either of the two CCCH-type ZF domains involved in RNA-binding (55) (Supplemental Figure 2B). They appear to result in distinct splicing effects depending on the ZF affected (56, 57). U2AF1 interacts directly with the AG dinucleotide at the 3′ splice site, and accordingly, S34 mutations have been shown to alter preference upstream of this dinucleotide, with a bias for C(AG) over T(AG), while Q157 mutations alter preference downstream of the dinucleotide, with a bias for (AG)G over (AG)A (56, 58, 59). Surprisingly, single-cell genomic analyses have also discovered rare cases of U2AF1 S34 and Q157 mutations co-occurring in cis, although any potential cooperative effect awaits further characterization (60).
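The sequence preferences described above can be made concrete with a short sketch. The Python below is illustrative only: the function name `u2af1_mutant_bias`, its input convention (a DNA string plus the index of the A in the 3′ splice site AG), and the output labels are assumptions introduced here, not part of any published tool. It encodes only the direction of the reported biases, not their magnitude, and says nothing about sites with other flanking nucleotides.

```python
def u2af1_mutant_bias(seq: str, ag_index: int) -> dict:
    """Report the direction of the mutant-U2AF1 bias for one 3' splice site.

    seq      -- DNA sequence spanning the intron/exon boundary (hypothetical input)
    ag_index -- 0-based index of the 'A' in the AG dinucleotide at the 3' splice site
    """
    seq = seq.upper().replace("U", "T")
    if seq[ag_index:ag_index + 2] != "AG":
        raise ValueError("ag_index must point at an AG dinucleotide")
    if ag_index < 1 or ag_index + 2 >= len(seq):
        raise ValueError("need one flanking nucleotide on each side of the AG")

    upstream = seq[ag_index - 1]    # nucleotide immediately 5' of the AG
    downstream = seq[ag_index + 2]  # nucleotide immediately 3' of the AG

    # S34 mutations: bias for C(AG) over T(AG)
    s34 = {"C": "favored", "T": "disfavored"}.get(upstream, "no reported bias")
    # Q157 mutations: bias for (AG)G over (AG)A
    q157 = {"G": "favored", "A": "disfavored"}.get(downstream, "no reported bias")
    return {"S34": s34, "Q157": q157}
```

For example, a site with the context `C(AG)G` would be labeled as favored by both classes of mutants, whereas `T(AG)A` would be labeled as disfavored by both.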
How do missense mutations in U2AF1 cause cancer? It remains difficult to predict the functional consequences of most splicing changes, and in the case of U2AF1, mutations affect splicing of many exons. Cross-linking and immunoprecipitation sequencing (CLIP-Seq) studies indicate that the U2AF dimer binds to 86%–88% of 3′ splice sites genome-wide (61, 62). From the wide array of affected genes, a number of specific downstream targets have been examined that are differentially spliced in the context of mutant U2AF1 and produce phenotypic effects. These include genes such as ATG7, H2AFY, STRAP, and IRAK4 (63–65). Interestingly, while wild-type U2AF1 is absolutely essential, expression of mutant U2AF1 S34F was not found to be required for continued proliferation and survival in vitro, arguing against a mutant-specific “addiction” (63, 66) and in favor of the “death by a thousand cuts” model (see above).
In further departure from canonical models, could U2AF1 mutants act on RNA in a splicing-independent manner or facilitate the acquisition of additional genetic alterations at the DNA level? Indeed, U2AF1 has been found to bind mRNAs in the cytoplasm and act as a regulator of translation, with the S34F mutation resulting in an overall increase in translation (67, 68). Furthermore, expression of mutant U2AF1 has been shown to induce DNA damage, originating from increased levels of reactive oxygen species (63) or increased formation of R-loops (like EWS-FLI1) (69, 70). As ATR is critical for the cellular response to excessive R-loops, mutations in U2AF1 also increase sensitivity to ATR inhibition (70), offering a promising therapeutic strategy for SF-mutant cancers. However, it remains unclear how elevated levels of DNA damage are involved in the development of myelodysplastic syndrome (MDS) and secondary AML, as these diseases typically exhibit low overall mutation burdens (71).
SF3B1. The SF3B1 gene encodes the largest subunit of the SF3B complex, involved in recognition of the branch point sequence (BPS) by the U2 small nuclear ribonucleoprotein (snRNP) component of the spliceosome (72). It is mostly affected by heterozygous missense mutations that are localized in the so-called HEAT (Huntingtin, EF3, PP2A, and TOR1) repeat domains, which are thought to mediate interactions with other members of the SF3B complex and the U2 snRNP (73). There is a tissue-type specificity to the hotspots, with K700 most often altered in MDS and R625 most often altered in uveal and other types of melanoma (Supplemental Figure 2B); however, the biological basis of this difference is unknown. All major SF3B1 mutations appear to act similarly to disrupt normal BPS recognition, resulting in utilization of alternative BPSs and splicing to cryptic 3′ sites located a short distance upstream of the canonical 3′ splice site (74–76). While the precise molecular mechanism remains incompletely understood, mutations may affect protein-protein interactions with other members of the SF3B complex or components of the spliceosome. Indeed, recent evidence indicates that SF3B1 mutations disrupt interactions with the spliceosomal protein SUGP1, and SUGP1 depletion is sufficient to phenocopy the splicing defects induced by SF3B1 mutants (77). Interestingly, recent analyses suggest that while the splicing errors caused by different hotspot mutations are qualitatively similar and impact the same genes, they may differ in the magnitude of their effect (60).
What are the functional consequences of SF3B1 mutations and the resulting loss of splicing fidelity? As might be expected, the use of unnatural splice sites frequently disrupts the reading frame of affected transcripts by generating premature termination codons and causes downregulation via nonsense-mediated decay (75). Several studies have identified such downregulated genes in SF3B1 mutant cells that may be required to suppress tumorigenesis, such as BRD9 and MAP3K7 (78, 79). However, continued expression of mutant SF3B1 appears to be dispensable, as targeted degradation of the mutant allele does not affect growth of cells in vitro (80). This result suggests that SF3B1 mutations, like U2AF1 mutations, may play a more important role in tumor initiation rather than tumor maintenance. Could SF3B1 mutations then promote cancer by increasing rates of mutagenesis, as has been proposed for U2AF1 mutations? In support of such a hypothesis, SF3B1 mutant cells do exhibit elevated levels of DNA damage, defects in DNA damage responses, and increased formation of DNA damage–prone structures such as R-loops (81–83). Aside from its canonical role in splicing, SF3B1 is also involved in several additional functions like 3′ end processing of histone pre-mRNAs and PRC1-mediated repression (84). The impact of SF3B1 mutations on these less understood roles and their relevance to cancer remains unknown.
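The link between cryptic 3′ splice site usage and reading-frame disruption can be illustrated with a minimal sketch. The Python below takes the extra intronic sequence that is retained in the mRNA when a cryptic site upstream of the canonical 3′ splice site is used, and reports whether it introduces an in-frame stop codon, a frameshift, or neither. The function name `cryptic_3ss_effect` is hypothetical, and the sketch assumes for simplicity that the upstream exon ends on a complete codon; it does not model the rules that determine whether a given transcript is actually degraded by nonsense-mediated decay.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def cryptic_3ss_effect(inserted: str) -> str:
    """Classify the effect of intronic sequence retained via a cryptic 3' splice site.

    inserted -- nucleotides between the cryptic and canonical 3' splice sites
                (hypothetical input; assumes the upstream exon ends in frame)
    """
    inserted = inserted.upper().replace("U", "T")

    # Read the inserted sequence codon by codon in the upstream reading frame.
    for i in range(0, len(inserted) - 2, 3):
        if inserted[i:i + 3] in STOP_CODONS:
            return "premature stop codon in inserted sequence"

    # No stop inside the insertion: a length not divisible by 3 shifts the frame
    # of everything downstream, which typically exposes a stop codon later on.
    if len(inserted) % 3 != 0:
        return "frameshift"
    return "in-frame insertion"
```

Under these assumptions, only insertions whose length is a multiple of three and that lack an in-frame stop codon would leave the downstream reading frame intact.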
SRSF2. The gene SRSF2 encodes a member of the serine/arginine–rich (SR-rich) protein family, members of which promote exon inclusion by binding to exonic splicing enhancer sequences (85, 86). Alterations in SRSF2 are almost exclusively heterozygous missense mutations at the P95 hotspot, located adjacent to its single RRM domain (Supplemental Figure 2B). SRSF2 binds to an SSNG consensus motif (where S = C or G) with an unusual ability to accommodate both C- and G-rich versions of the motif (87). As with U2AF1 and SF3B1, hotspot mutations in SRSF2 do not appear to be strictly loss-of-function alterations, as phenotypes of Srsf2 P95H heterozygous mice are distinct from phenotypes of Srsf2 heterozygous knockout mice (88). Furthermore, relative to wild-type Srsf2, P95 mutants have higher affinity for CCNG motifs and reduced affinity for GGNG motifs, leading to enhanced splicing of CCNG motif–containing exons and repressed splicing of GGNG motif–containing exons (88). For example, Ezh2 contains a poison exon that is included more in Srsf2 mutant cells, leading to EZH2 protein downregulation. Overexpression of EZH2 partially rescued the defect in colony formation of Srsf2 mutant hematopoietic stem cells, suggesting that it represents a functionally relevant downstream target (88). However, the splicing change in EZH2 was much weaker in a different context (89), and there are likely additional targets responsible for Srsf2 mutant phenotypes. In what has emerged as a common theme surrounding SF mutations, SRSF2 mutants have also been shown to increase DNA damage via elevated levels of R-loops (69, 70). In an interesting twist, however, mutant SRSF2 is proposed to induce R-loops not via its altered splicing activity but via its splicing-independent role in transcription (90) and transcriptional pause release (91).
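The SSNG motif logic described above lends itself to a small sketch. The Python below counts overlapping occurrences of the C-rich (CCNG) and G-rich (GGNG) versions of the motif in an exon sequence and applies a deliberately crude heuristic for the direction in which P95 mutants might shift its inclusion. The function names `ssng_motif_counts` and `p95_predicted_direction` are hypothetical, and simple motif counting is an assumption made for illustration; actual splicing outcomes depend on motif position, context, and other factors that are not modeled here.

```python
import re

def ssng_motif_counts(exon: str) -> dict:
    """Count overlapping SSNG-type motifs (S = C or G, N = any base) in an exon."""
    exon = exon.upper().replace("U", "T")

    def count(pattern: str) -> int:
        # A zero-width lookahead lets overlapping matches be counted.
        return len(re.findall(f"(?={pattern})", exon))

    return {
        "CCNG": count("CC[ACGT]G"),
        "GGNG": count("GG[ACGT]G"),
        "SSNG": count("[CG][CG][ACGT]G"),
    }

def p95_predicted_direction(exon: str) -> str:
    """Crude heuristic: compare C-rich and G-rich motif counts (illustration only)."""
    counts = ssng_motif_counts(exon)
    if counts["CCNG"] > counts["GGNG"]:
        return "inclusion may be enhanced by P95 mutants"
    if counts["GGNG"] > counts["CCNG"]:
        return "inclusion may be repressed by P95 mutants"
    return "no predicted bias"
```

Note that the SSNG count also includes mixed versions of the motif (CGNG and GCNG), for which the text above reports no mutant-specific change in affinity.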
ZRSR2. The ZRSR2 gene encodes a factor essential for the splicing of minor U12-type introns (92). In humans, U12-type introns are only found in a subset of 700–800 genes and have distinct features, such as highly conserved 5′ splice sites and BPSs, as compared with the much more prevalent U2-type introns (93). Mutations in ZRSR2 consist predominantly of loss-of-function alterations (94, 95) (Supplemental Figure 2B) and are associated with retention of minor U12-type introns, with consistent changes observed across different patient cohorts. In comparison, retention of major U2-type introns in ZRSR2 mutant samples is much more variable, suggesting secondary effects from perturbations to other SFs or heterogeneous effects from the specific type of ZRSR2 mutation present (96, 97). Interestingly, not all U12-type introns are equally affected by loss of ZRSR2, with a bias for retention of introns containing specific features, such as branch points more proximal to the 3′ splice site (96, 98).
A recent study combining ZRSR2-regulated splicing analysis with genetic screens converged on LZTR1, a regulator of RAS-related GTPases. LZTR1 contains a U12-type intron and is downregulated in response to ZRSR2 loss as a result of increased intron retention. Importantly, depletion of LZTR1 reverted the self-renewal capacity of ZRSR2-knockout hematopoietic stem cells back to wild-type levels (96). But the question remains: why do cells not select for mutations in LZTR1 in the development of MDS or leukemia? LZTR1 is mutated in other cancer types like glioblastoma and also in the RASopathy known as Noonan syndrome (99, 100), suggesting that it is not less prone to genetic alterations. The possibility remains that loss of ZRSR2 provides additional fitness benefits, perhaps through effects on genes in addition to LZTR1, that in combination make it a more potent driver event in MDS and leukemia.
RBM10. The RBM10 gene encodes a ubiquitously expressed regulator of alternative splicing with several domains known to interact with RNA. As with ZRSR2, mutations in RBM10 are frequently loss-of-function alterations, characteristic of a classic TS gene (Supplemental Figure 2B). Multiple CLIP-Seq studies indicate that RBM10 binds to introns at both 5′ and 3′ splice sites, with a greater enrichment at 3′ splice sites (101, 102). Although the precise mechanism is still unknown, RBM10 appears to primarily mediate skipping of cassette exons, with loss of RBM10 then resulting in aberrant exon inclusion (101, 102). Additional analyses indicate that RBM10 loss also correlates with upregulation of genes that normally show intron retention under wild-type conditions, suggesting that RBM10-mediated splicing regulation may also act to control gene expression levels (51). Overall, among the hundreds of splicing changes observed in RBM10 mutant samples, it remains unclear which are critical for tumor suppression. While several genes have been nominated as potential RBM10 targets of relevance (103), they have yet to converge on a consistent pathway or mechanism. Nevertheless, data from mouse models of lung cancer have confidently validated Rbm10’s TS role in vivo (104–106), and further studies will begin to reveal the detailed mechanisms involved. In addition to binding to protein-coding transcripts, RBM10 was also reported to bind some microRNAs (107), a class of noncoding RNAs discussed below.
Despite their small size (about 21–22 nucleotides), microRNAs (miRs) profoundly affect cellular transcriptomes and proteomes in worms (108, 109) and humans (110) alike. They typically act by pairing with and degrading or inactivating select mRNAs (111). In that capacity they can serve as either oncogenes (112, 113) or TS genes (114) or act in key oncogenic pathways (115). The intricate biogenesis of miRs involves two class III endoribonucleases, DROSHA and DICER, and one accessory protein, DGCR8 (116). All three genes are annotated in the COSMIC database (Figure 2); however, only DICER1 mutations are considered proven drivers (Table 1 and Supplemental Figure 2C).
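The pairing between a miR and its target mRNAs can be sketched as a search for complementary sites. The Python below finds positions in a 3′-UTR that are perfectly complementary to a short window of the miR. Everything about the interface is an assumption made for illustration: the function names are hypothetical, the default window (nucleotides 2–8 counted from the 5′ end of the miR) is a commonly used convention rather than something stated in the text above, only Watson-Crick pairs are considered, and the sequences in the usage note are made up.

```python
# Complement table for RNA bases (Watson-Crick pairs only; G:U wobble is ignored).
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (T is read as U)."""
    rna = rna.upper().replace("T", "U")
    return rna.translate(_RNA_COMPLEMENT)[::-1]

def complementary_sites(mir: str, utr: str, start: int = 1, length: int = 7) -> list:
    """Return 0-based UTR positions perfectly complementary to a window of the miR.

    start, length -- 0-based window within the miR; the default covers
                     nucleotides 2-8 (an assumed convention, see lead-in)
    """
    mir = mir.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    window = mir[start:start + length]
    site = reverse_complement_rna(window)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]
```

With the made-up miR `UACGGAUCCAGUUAGCAUGCAA`, the default window is `ACGGAUC`, so the function looks for `GAUCCGU` in the UTR.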
DICER1. Mutations of DICER1 associated with cancer were first described as heterozygous germline alterations in patients with familial pleuropulmonary blastoma (PPB), a rare pediatric lung tumor. In most PPB families analyzed, frameshift mutations preceded the RNase III domains, but no loss of heterozygosity was observed (117). Subsequently, recurrent DICER1 mutations, mostly somatic but some germline, were described in various germ cell–derived tumors (118). Most of them were heterozygous missense mutations mapping to the metal-binding amino acids within the RNase IIIb domain, usually the D1709 residue (Supplemental Figure 2C). Predictably, in in vitro reactions, these mutants were defective in processing post-DROSHA double-stranded substrates (“pre-miRs”) into 22-nucleotide single-stranded mature species and showed a greater bias for the “passenger” strand at the expense of the canonical “guide” strand, which typically performs gene silencing functions.
Subsequent cancer profiling studies identified similar mutations (and companion genetic alterations in DROSHA and DGCR8) in renal Wilms tumors (119, 120), with well-documented detrimental effects on miR biogenesis. These and subsequent papers specifically demonstrated decreased levels of tumor-suppressive/oncogene-targeting miRs such as let-7 family members in both DICER1- and Microprocessor-mutant Wilms tumors (121, 122). Similar genetic lesions have now been found in many other cancer types (123).
The frequent retention of the wild-type DICER1 allele informed the concept that DICER1 is a haploinsufficient TS whose biallelic loss would make cells nonviable. This is in good agreement with genetically engineered mouse models (GEMMs) of cancer, where deletion of one copy of the gene — but not both! — was found to accelerate tumorigenesis (124). It also agrees with the common observation that, as a class, miRs are downregulated in cancers (125, 126).
In addition to this LoFi/“death by a thousand cuts” model, it has been proposed that the existence of a recurrent hotspot mutation is more consistent with a targeted “death by the smoking gun” mechanism, wherein, as a result of natural selection, mutant DICER would preferentially undermine tumor-suppressive miRs and possibly other DICER-dependent small RNAs (123). While this model makes intuitive sense, at present there is limited experimental support for it. To complicate the matter, DICER is also implicated in non-small-RNA-based processes such as the DNA damage response (127). However, the fact that both DICER1 and DROSHA/DGCR8 are mutated in the same tumor types argues against the importance of non-miR mechanisms. On balance, the most parsimonious explanation is that depletion of miRs and ensuing overproduction of translatable mRNA trigger neoplastic transformation according to the LoFi scenario.
Both mRNAs and miRs function mainly in the cytosol. To get there they rely, respectively, on the TREX/NFX1 and exportin-5 systems (128). On the other hand, long noncoding (lnc), small nuclear (sn) and nucleolar (sno), and ribosomal (r) RNAs rely instead on the exportin-1 protein (XPO1) (129), which is also involved in the nuclear-cytoplasmic shuttling of up to 1000 proteins (130, 131), including nucleophosmin-1 (NPM1), a key player in the regulation of rRNA biogenesis. It is XPO1 and NPM1, bona fide cancer drivers (Table 1 and Supplemental Figure 2D), that attest to the importance of RNA transport for neoplastic transformation, although conclusive data remain scarce.
XPO1. The protein originally dubbed CRM1 (for chromosomal maintenance 1), but now commonly known as exportin-1, is encoded by XPO1 (132). Evidence implicating XPO1 in direct RNA binding is very limited, one notable example being trimethylguanine-capped U3 snoRNA (133). Mutations in the XPO1 gene, specifically the E571K hotspot mutation (Supplemental Figure 2D), were first reported in primary mediastinal B cell and Hodgkin lymphomas and some other B cell malignancies (134), at the same time that genetic and pharmacological targeting of XPO1 proved to have anticancer effects in a GEMM of lung cancer (135). Subsequent mechanistic studies focused largely on the protein cargoes of XPO1 and led to the identification of a significant number of nuclear export signal–bearing (NES-bearing) proteins redistributed between the nucleus and the cytoplasm in E571K mutant versus wild-type cells (136). Based on these data, the authors concluded that the E571K mutation alters, rather than abolishes, recognition of NES in favor of cargoes with negatively charged C-termini; however, another XPO1 mutation, D624G, appears to impair nuclear export overall. Interestingly, certain proteins highly relevant to cancer, such as the p53 TS, were retained in the nucleus both upon XPO1 chemical inhibition and as a result of the E571K mutation, suggesting the genetic impairment of XPO1 function. Unfortunately, very little is known at this point about the impact of these mutations on export of RNAs or, for that matter, RBPs like NPM1.
NPM1. Nucleophosmin-1, encoded by NPM1, has several nonoverlapping functions, many of which have to do with regulation of the p53 pathway and genomic stability, but its RNA-binding activity is thought to be associated mainly with ribosome biogenesis and nuclear export of rRNAs (137). NPM1 is frequently mutated in standard-risk AML, with the majority of mutations mapping to the C-terminal domain encoded by exon 12, thought to be responsible for RNA binding (138) (Supplemental Figure 2D). Most of these mutations are frameshifts, resulting in a protein isoform with a distinct C-terminal tail (Val-Ser-Leu-Arg-Lys). This amino acid sequence constitutes an additional NES, resulting in predominantly cytoplasmic NPM1 (139); predictably, this redistribution was later shown to be dependent on XPO1 activity (140). Given the multitude of its functions, it is difficult to attribute the effects of NPM1 mutation to a particular pathway or process. Absent such data, effects on ribosome abundance and mRNA translation remain a distinct possibility. In support of this notion, several cancer drivers from the RBP family function in protein synthesis.
As the final stage of the gene expression program, mRNA translation serves as a convergent point at which the many steps of RNA processing collectively determine the amount of protein that is produced. Here, we highlight two factors that have recurrent mutations in cancer (Table 1 and Supplemental Figure 2E) and regulate mRNA translation in complex ways.
DDX3X. The DDX3X gene encodes an ATP-dependent DEAD-box RNA helicase that plays a role in nearly all steps of RNA metabolism (141). Genetic alterations in DDX3X consist of missense and loss-of-function mutations. Missense mutations occur mainly in the conserved helicase core (Supplemental Figure 2E), made up of two RecA-like domains that mediate ATPase and RNA-binding activity (141). Studies in both yeast and mammalian cells indicate that mutations associated with medulloblastoma are essentially loss-of-function alleles with impaired enzymatic activity (142–145). Cellular effects of DDX3X mutants vary and likely depend on context, with globally impaired translation in some cases (146) and more transcript-specific impaired translation in other cases (147).
DDX3X has a well-described role as an activator of translation for mRNAs with long and structured 5′-UTRs (148). Consistent with this function, the growth defects seen across a large set of DDX3X mutants in yeast correlated best with defects in translation of structured 5′-UTR–containing mRNAs, rather than with global levels of translation (142). Although several studies have recently mapped the specific transcripts bound and regulated by DDX3X (146, 147, 149), which target genes are especially critical for suppressing or driving cancer remains unclear. Mutations in DDX3X are particularly frequent in the Wnt and Shh subgroups of medulloblastoma, and intriguingly, expression of DDX3X mutants potentiated Wnt pathway signaling (150). In mouse models of Wnt- and Shh-driven medulloblastoma, Ddx3x knockout also increased disease penetrance and reduced tumor latency (151). Future studies will undoubtedly reveal more about the mechanistic link between DDX3X’s role in mRNA translation and its function as a TS.
EIF1AX. The gene EIF1AX encodes eukaryotic translation initiation factor 1A (eIF1A), a key initiation factor that stimulates assembly of the preinitiation complex and scanning of the mRNA for the AUG start codon (152, 153). EIF1AX alterations consist mainly of substitutions clustered in the first 15 amino acids of the N-terminal tail or a recurrent splice site mutation in the C-terminal tail that leads to usage of a cryptic splice acceptor and an in-frame deletion (Supplemental Figure 2E). Structural studies of the yeast and mammalian preinitiation complexes indicate that residues in the N-terminal tail of eIF1A are in contact with both the mRNA start codon and the initiator transfer RNA anticodon, allowing eIF1A to sense correct codon-anticodon pairing (154, 155). The N-terminal tail of eIF1A also interacts with the +4 mRNA position adjacent to the start codon, providing additional contextual sensing during scanning of the mRNA (154). Although there are likely some mutation-specific effects, substitutions in the N-terminal tail of eIF1A as a whole appear to alter start site recognition during translation initiation. In yeast, various N-terminal mutants of eIF1A exhibit greater discrimination against near-cognate UUG start codons and against cognate AUG start codons in “weaker” sequence contexts, resulting in reduced initiation at genes possessing such suboptimal start sites (156).
For the C-terminal A113splice variant observed in thyroid cancer, the precise effect on start site recognition is less well characterized. Interestingly, its expression results in enhanced translation of ATF4, which is typically repressed owing to preferential initiation at upstream open reading frames (157). The elevated levels of ATF4 in the context of mutant EIF1AX are associated with activation of mTOR signaling, enhanced MYC stability, and an overall increase in protein synthesis (157). Furthermore, mutations in EIF1AX commonly co-occur with RAS mutations, and both N-terminal missense mutants and the C-terminal A113splice variant increased transformation efficiency in the context of oncogenic RAS (157, 158), suggesting that the impact of EIF1AX mutants on translation may provide conditions especially favorable for RAS-induced oncogenesis.
Even a cursory survey of the literature and existing data sets indicates that most mutations in RBP genes are either deep deletions or frameshift mutations, which result in loss of expression and function. Based on these observations one could conclude that RBPs as a class perform tumor-suppressive roles. This does not come as a surprise when considering loss-of-function mutations, but hotspot missense mutations are typically viewed as gain-of-function events. However, a recent experimental study using the easy-CLIP technique yielded surprisingly few examples of RBPs whose RNA-binding activity was enhanced by cancer-specific missense mutations, and those examples are not known to be cancer drivers (159). Instead, it appears that in the case of RBPs such as DICER1 and SRSF2 (and perhaps U2AF1 and SF3B1 as well), hotspot missense mutations do not result merely in impaired functions typical of loss-of-function mutations, but rather result in the LoFi phenotype (Figure 3).
Consequences of LoFi versus loss-of-function phenotypes. Examples of cancer-driving mutant RBPs with roles in mRNA splicing and microRNA biogenesis and associated molecular events. Low fidelity (LoFi) refers to atypical functions of RBPs with hotspot missense mutations, and loss of function refers to diminished RBP functions due to heterozygous frameshift mutations or monoallelic deletions. Homozygous losses of RBP genes are often lethal.
Why would RBPs be tumor suppressive, and why would loss-of-function or LoFi events contribute to cancer? Several distinct scenarios might be in play: (a) For proteins with dual affinity for DNA and RNA (of which EWS is but one example), loss of RNA binding could unmask oncogenic activity of their transactivation domains and allow them to partner with canonical transcription factors. (b) A variation on this RNA-to-DNA theme is the well-documented involvement of many mutated splicing factors (as well as the “moonlighting” splicing factor EWS-FLI1) in the formation of R-loops in the DNA, ensuing DNA damage, and acquisition of further cancer-driving mutations (160). (c) Another way LoFi versions of RBPs could contribute to oncogenesis is by increasing protein output, either globally or in a targeted way, to support the rapid increase in cell mass (161). Relevant mechanisms might include the dysregulated miR pathway, possibly nuclear export, and certainly translation itself.
Admittedly, in contrast to gain-of-function mutations in oncogenes where direct inhibition provides a straightforward and beneficial therapeutic strategy, targeting cancers with RBP loss-of-function and LoFi variants is a more challenging proposition. Still, there could be targetable vulnerabilities related to sustained DNA damage (162) [scenario (b) above] or the unfolded protein response, one direct consequence of increased translation (163) [scenario (c) above]. Furthermore, some of the newest cancer therapeutics, FDA-approved or under clinical development, are being used in manners informed by mutations in RBP genes. The following two examples illustrate this point.
First, cancers with imprecisely functioning spliceosomes can be successfully targeted by direct spliceosome inhibitors such as H3B-8800 (164) or by inhibition of enzymes that regulate spliceosome factors via posttranslational modifications, for example, PRMT5 (165) and CLK (166) (reviewed in ref. 50). Notably, the PRMT5 inhibitor GSK3326595, the CLK inhibitor SM08502, and H3B-8800 are all currently in clinical trials for various types of cancer (for example, NCT04676516, NCT03355066, and NCT02841540; ClinicalTrials.gov).
Second, in late 2020, the FDA granted approval for the XPO1 inhibitor selinexor (in combination with bortezomib and dexamethasone) for the treatment of multiple myeloma. Although XPO1 mutations have not been found in multiple myeloma, genetic screens have identified it as an essential gene in this very aggressive plasma cell cancer (167). Selinexor is also in a phase I/II clinical trial in patients with non-Hodgkin lymphomas (NCT03147885), some of which do accumulate XPO1 mutations. The outcome of this trial should determine whether these mutations serve as predictive biomarkers and, more broadly, whether mutations in the RBP genes can indeed be successfully targeted in the clinic.
Relevant research in our laboratories has been supported by NIH grants R00 CA208028 (to PSC), R01 CA196299 (to ATT), and U01 CA232563 (to ATT). We thank many members of the RNA Society–sponsored RNA Salon at the University of Pennsylvania for stimulating discussions. We apologize to all authors whose important and relevant papers we were unable to discuss owing to space limitations.
Conflict of interest: ATT receives funding from Pfizer’s ASPIRE Program for research on alternative splicing in cancer.
Copyright: © 2021, American Society for Clinical Investigation.
Reference information: J Clin Invest. 2021;131(18):e151627. https://doi.org/10.1172/JCI151627.