Landscape and clinical significance of long noncoding RNAs involved in multiple myeloma expressed fusion transcripts
American Journal of Hematology (IF 10.1), Pub Date: 2021-12-27, DOI: 10.1002/ajh.26450
Ane Amundarain 1, 2 , Luis V Valcárcel 1, 3 , Raquel Ordoñez 1, 2 , Leire Garate 1, 2, 4 , Estíbaliz Miranda 1, 2 , Xabier Cendoya 3 , Arantxa Carrasco-Leon 1, 2 , María José Calasanz 2, 5 , Bruno Paiva 2, 4, 5, 6 , Cem Meydan 7, 8, 9 , Christopher E Mason 7, 8, 9 , Ari Melnick 7 , Paula Rodriguez-Otero 2, 4 , José I Martín-Subero 2, 10, 11, 12 , Jesús San Miguel 2, 4 , Francisco J Planes 3 , Felipe Prósper 1, 2, 4 , Xabier Agirre 1, 2

Multiple myeloma (MM) is a hematologic neoplasm characterized by clonal expansion of malignant plasma cells (PCs) in the bone marrow, and it shows clinical, genetic, and epigenetic heterogeneity. Chromosomal translocations are one of the hallmarks of MM and mainly involve the immunoglobulin heavy chain locus (IGH). These translocations usually place various oncogenes under the control of IGH, leading to the up-regulation of genes that provide a selective growth advantage to MM cells.1 Five recurrent IGH translocations have been described in MM; however, in many cases, the second gene involved is not defined in routine clinical analyses. In addition, recent studies have reported novel recurrent fusion partners and novel non-IGH fusions beyond the well-known translocations.2, 3 Nevertheless, these studies did not consider the normal B-cell counterparts, which may provide new insights into the role of fusion transcripts (FTs) in MM. Furthermore, MM is also associated with deregulation of long noncoding RNAs (lncRNAs), a group of genes with increasing relevance in cancer.4 Various studies suggest the involvement of lncRNAs in chromosomal translocations; however, this has not been assessed in MM.

Here, to define the landscape of expressed FTs in MM, we analyzed strand-specific RNA-seq (ssRNA-seq) data from 35 samples of 6 different B-cell subpopulations (5 naïve, 7 centroblast, 7 centrocyte, 8 memory, 5 tonsillar PC, and 3 bone marrow [BM] PC samples) obtained from 11 healthy donors (8 tonsil and 3 BMPCs) and from PCs of 37 MM patients, paying particular attention to FTs involving lncRNAs (lncFTs). Using the STAR-Fusion algorithm, we initially identified 2169 FTs. After applying several computational filtering steps, we defined 1454 FTs expressed in B-cells and MM samples (Figure S1). The highest numbers of FTs were detected in healthy donor PCs (tonsillar plasma cells [TPC] and BMPC) (Figure S2A-B). Based on the biological relevance of IG genes in B-cells and malignant PCs, detected FTs were classified into IG and REST (none of the associated genes corresponded to an IG gene) categories (Figure S2C–E). Of the FTs detected in healthy PCs, 82.5% involved IG genes, harbored very few reads per transcript, and were supported only by junction reads, without any spanning reads covering the non-IG partner gene. Therefore, FTs that were not supported by at least one spanning read were filtered out, resulting in the final detection of 208 expressed FTs in normal B-cells and MM cells (Figure S1A). To validate our results and identify consistently detected FTs, we also applied the ARRIBA and STAR-SEQR algorithms to our cohort with the same filters described above. One hundred and fifty-eight FTs were detected by at least two algorithms and were selected for further analyses after a quality check step (Appendix S1; Figures S1A, S2F–I; Table S1). These expressed FTs were detected in every cell population, with a significantly higher number of FTs in MM cells (median of 3 ± 2.97) (Wilcoxon p-value <.001) (Figure S2G).
A similar number of reads per FT was detected in all cell subpopulations (analysis of variance [ANOVA] p-value .382), suggesting that FTs are expressed consistently at low levels (Figure S2G), and most of the expressed FTs occurred between two non-IG partners (Figure S2H). Characteristic features of B-cells include IG gene rearrangement and active transcription of IG genes, leading to the transcription of thousands of similar transcripts from these loci,5 which could be misidentified as FTs in these cells. Thus, cell-specific features should be considered when implementing an adequate FT detection pipeline for each cell type to exclude false positive events. Furthermore, the presence of FTs in normal B-cells indicates that FTs are not exclusive to tumor cells, suggesting that FTs may contribute to transcriptional diversity in healthy tissues.
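The two filters described above (at least one spanning read, and detection by at least two algorithms) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `FusionCall` record, its field names, and the sample and gene labels `GENE_X`, `GENE_Y`, and `GENE_Z` are hypothetical, and the sketch assumes that every caller reports the two partner genes in the same order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionCall:
    """One fusion call from one caller (hypothetical record layout)."""
    sample: str
    caller: str           # e.g. "STAR-Fusion", "ARRIBA", "STAR-SEQR"
    gene_a: str
    gene_b: str
    junction_reads: int
    spanning_reads: int


def consensus_fusions(calls, min_spanning=1, min_callers=2):
    """Return the (sample, gene_a, gene_b) keys that pass both filters."""
    callers_by_fusion = defaultdict(set)
    for call in calls:
        # Discard calls supported only by junction reads.
        if call.spanning_reads < min_spanning:
            continue
        key = (call.sample, call.gene_a, call.gene_b)
        callers_by_fusion[key].add(call.caller)
    # Keep fusions reported by enough independent callers.
    return {key for key, callers in callers_by_fusion.items()
            if len(callers) >= min_callers}


# Illustrative input only; read counts are made up.
calls = [
    FusionCall("MM_01", "STAR-Fusion", "IGH", "NSD2", 12, 5),
    FusionCall("MM_01", "ARRIBA", "IGH", "NSD2", 10, 4),
    FusionCall("TPC_01", "STAR-Fusion", "IGK", "GENE_X", 3, 0),
    FusionCall("TPC_01", "ARRIBA", "IGK", "GENE_X", 2, 0),
    FusionCall("MM_02", "STAR-Fusion", "GENE_Y", "GENE_Z", 4, 2),
]
kept = consensus_fusions(calls)
```

In this toy input, the IG call in the tonsillar PC sample is dropped because it has no spanning reads, and the single-caller call is dropped because it lacks a second supporting algorithm.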

From the 158 FTs, we filtered out those detected in at least one normal B-cell sample to focus on MM-specific FTs, leading to the identification of 79 expressed FTs (61 unique) (Figure S1A, Table S2), 29.5% of which had not been previously described (Figure 1A). At least one expressed FT was identified in 75.7% of the MM samples (Figure 1B), indicating that some FTs emerge specifically after malignant transformation. A Human Phenotype Ontology analysis of the coding genes involved in these 61 unique FTs showed a significant enrichment of genes associated with B lymphocyte dysfunction phenotypes (p-adj <.05), suggesting that genes important for B-cell abnormalities and MM pathogenesis may be more prone to FT formation. Most of the MM-expressed FTs showed an overall low read count, with some exceptions (Figure 1C). As previously described,2 most MM FTs (85.3%) were patient-specific, but 9 were recurrently expressed (Figure 1D). A total of 88.5% of MM-specific FTs were derived from the fusion of 2 non-IG partners, and IG FT percentages were lower than those previously reported,2, 3 probably due to our smaller cohort size (Figure 1E). Nevertheless, we identified the IGH-NSD2 expressed FT derived from t(4;14) in two patients and two MM cell lines (Figure S3A–C), and described a novel FT between the genes GBE1 and KIF20B in one MM cell line (Figure S3D–F). Recent studies have shown the involvement of lncFTs in MM, such as lncFTs with PVT1, but a complete characterization of the lncFT transcriptome is still pending.3 We observed that 27.9% of MM-specific expressed FTs were lncFTs (Figure 1F), some of them leading to the overexpression of the associated lncRNA (Figure 1G,H), as in the case of FTs involving oncogenes.2, 3 Interestingly, a relevant fraction of MM-specific FTs occurred between two adjacent genes on the same DNA strand and were therefore defined as transcription read-throughs (RTs) (Figure 1I), a novel class of MM-specific FT.
A total of 64.3% of RTs involved a lncRNA as a fusion partner gene (Figure 1J). Some of them were detected at low expression in both MM patient samples and cell lines, and were validated in cell lines through real-time quantitative reverse transcription PCR (qRT-PCR) and Sanger sequencing (Figure S3G–I). Furthermore, three RTs with lncRNAs were recurrent, such as the FT between AC092691.1 and LSAMP (Figure 1K), which showed increased expression of AC092691.1 in comparison to normal PCs and MM samples without the FT (Figure 1L). The presence of functional oncogenic lncFTs and RTs has been reported for other tumor types,6 suggesting that lncFTs could be important in MM, but additional studies will be needed to determine their role.
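The selection of MM-specific FTs and their classification as recurrent, lncFT, or read-through can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the `Gene` record, the placeholder genes `LNC_A` to `GENE_D`, the sample labels, and in particular the `rank` field, which stands in for "adjacent genes on the same strand" by treating two genes as adjacent when their ranks along the chromosome differ by one. The source does not state how adjacency was computed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Minimal gene annotation (hypothetical layout)."""
    name: str
    chrom: str
    strand: str    # "+" or "-"
    rank: int      # assumed position index among genes on this chromosome strand
    biotype: str   # e.g. "lncRNA" or "protein_coding"


def is_read_through(gene_a, gene_b):
    """True when both partners are neighbours on the same chromosome strand."""
    return (gene_a.chrom == gene_b.chrom
            and gene_a.strand == gene_b.strand
            and abs(gene_a.rank - gene_b.rank) == 1)


def mm_specific_fusions(mm_calls, normal_calls):
    """Map each fusion absent from every normal sample to the MM samples carrying it."""
    seen_in_normal = set().union(*normal_calls.values())
    samples_by_fusion = defaultdict(set)
    for sample, fusions in mm_calls.items():
        for fusion in fusions - seen_in_normal:
            samples_by_fusion[fusion].add(sample)
    return dict(samples_by_fusion)


# Placeholder annotation and calls; none of these values come from the study.
genes = {
    "LNC_A": Gene("LNC_A", "chr3", "-", 10, "lncRNA"),
    "GENE_B": Gene("GENE_B", "chr3", "-", 11, "protein_coding"),
    "GENE_C": Gene("GENE_C", "chr4", "+", 5, "protein_coding"),
    "GENE_D": Gene("GENE_D", "chr14", "-", 7, "protein_coding"),
}
mm_calls = {
    "MM_01": {("LNC_A", "GENE_B"), ("GENE_C", "GENE_D")},
    "MM_02": {("LNC_A", "GENE_B")},
    "MM_03": {("GENE_B", "GENE_C")},
}
normal_calls = {"BMPC_01": {("GENE_B", "GENE_C")}}

specific = mm_specific_fusions(mm_calls, normal_calls)
recurrent = {f for f, samples in specific.items() if len(samples) >= 2}
read_throughs = {f for f in specific if is_read_through(genes[f[0]], genes[f[1]])}
lnc_fts = {f for f in specific if any(genes[g].biotype == "lncRNA" for g in f)}
```

Keeping the per-fusion sample sets, rather than a flat list, makes it straightforward to distinguish the total number of MM-specific events from the number of unique FTs, as the text does with "79 expressed FTs (61 unique)".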

FIGURE 1
Expressed lncFTs detected in multiple myeloma (MM) patient samples. (A) Distribution of annotated and non-annotated FTs detected in MM patient samples. The FusionHub, FusionGDB, and CTAT HumanFusionLib databases were used for FT annotation. (B) Median number of FTs expressed per MM patient sample. We included the 79 MM-specific FTs (61 unique). (C) Read count of the 79 FTs expressed in MM patient samples. (D) Recurrences among FTs detected in MM patient samples. (E) Distribution of MM FTs among transcripts with IG genes (IGH, IGL) or without IG genes (REST). (F) Distribution of MM FTs per biotype. FTs were classified into those involving lncRNAs (lncFTs) and those in which no lncRNA is present. (G,H) Expression of the AC090578.1 and AC109630.1 lncRNA fusion partners in MM patients with (green) or without (pink) the FTs involving these lncRNAs. (I) Distribution of MM-specific unique FTs per chromosomal location. RTs were defined as FTs occurring between two adjacent genes located on the same DNA strand. (J) Distribution of MM-specific unique RTs per biotype. (K) Recurrence of RTs among MM patient samples. (L) Expression of the AC092691.1 lncRNA in MM patients with (green) or without (pink) the FT involving this lncRNA. Gene expression was computed in TPM (transcripts per million). (M) Forest plot summarizing the effect of the variables (ISS stage, TP53 mutations, and the expression of 3 lncFTs [TEX35-AL37796.1, AL050309.1-KLF8, and PVT1-IGL]) on PFS in the final model selected by BIC. The points represent the expected hazard ratio and the horizontal bars the 95% confidence interval. p-values for each variable are in the right column of the plot. (N) Kaplan–Meier curves showing the combined study of the expression of 3 lncFTs (TEX35-AL37796.1, AL050309.1-KLF8, and PVT1-IGL) together with TP53 mutations and ISS stage for PFS of MM patients. All risk groups harbor patients with lncFTs.
(O) Forest plot summarizing the effect of the variables (ISS stage, Del 17p, Amp 1q, and expression of the lncFT TEX35-AL37796.1) on OS in the final model selected by BIC. The points represent the expected hazard ratio and the horizontal bars the 95% confidence interval. p-values for each variable are in the right column of the plot. (P) Kaplan–Meier curves showing the combined study of the expression of the lncFT TEX35-AL37796.1 together with Del 17p, Amp 1q, and ISS stage for OS of MM patients. All risk groups harbor patients with lncFTs. All the analyses were performed using the CoMMpass data set IA15 release, using patient data at diagnosis. FT, fusion transcripts; IGH, FT with IGH genes; IGK, FT with IGK genes; IGL, FT with IGL genes; REST, FT without IG genes.

Finally, we assessed whether lncFTs might affect the outcome of MM patients by analyzing expressed FTs in 599 MM patients included in the MMRF CoMMpass data set (release IA15). We used the intersection of the STAR-Fusion and ARRIBA calls, identifying 556 expressed lncFTs. Interestingly, we found that 35% of the MM-specific unique lncFTs defined in our cohort were present in the CoMMpass data set. We observed various FTs between lncRNAs and IG genes (IGK-FAM230C, IGH-LINC-PINT), suggesting that IG translocations in MM patients could involve both coding and noncoding partner genes, and that lncRNAs could explain some of the MM cases in which the gene associated with IG in the translocation remained unknown. We validated the robustness of our algorithm by comparing the patients in which we detected the IGH-NSD2 FT with those in which t(4;14) was detected by whole-genome sequencing (WGS): we identified the IGH-NSD2 FT in 75 of the 79 samples positive by WGS and additionally detected the expression of this FT in 2 other MM samples in which WGS was negative for t(4;14) (Fisher's exact test p-value = 6.7e-88). To assess whether the lncFTs could be associated with prognosis in MM, we selected those lncFTs that were detected in more than 2% of MM patients (Figure S4A), and we evaluated the combination of lncFTs and the defined high-risk genetic markers1 (International Staging System [ISS] stage, t(4;14), t(14;16), t(14;20), del(17p), deletion of CDKN2C, del(1p), amp(1q), and mutations of TP53) using a multivariate Cox proportional hazards (coxph) model and the Bayesian information criterion (BIC) to select the optimal number of variables.
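The concordance check between the IGH-NSD2 FT and WGS-detected t(4;14) amounts to a 2×2 contingency table. The sketch below computes the sensitivity from the reported counts and a one-sided Fisher's exact test using only the standard library. The counts 75, 79, and 2 come from the text; the number of samples negative by both methods is not reported, so the value used here (all 599 patients having WGS data) is an assumption, and the resulting p-value should not be read as a reproduction of the published one.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a count of at
    least `a` in the top-left cell, given fixed row and column totals.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)
    numer = sum(comb(col1, x) * comb(total - col1, row1 - x)
                for x in range(a, min(row1, col1) + 1))
    return numer / denom


# Rows: FT detected / FT not detected. Columns: WGS positive / WGS negative.
ft_pos_wgs_pos = 75          # reported in the text
ft_pos_wgs_neg = 2           # reported in the text
ft_neg_wgs_pos = 79 - 75     # derived from the reported counts
# ASSUMPTION: every one of the 599 patients has WGS data.
ft_neg_wgs_neg = 599 - ft_pos_wgs_pos - ft_pos_wgs_neg - ft_neg_wgs_pos

sensitivity = ft_pos_wgs_pos / (ft_pos_wgs_pos + ft_neg_wgs_pos)
p_value = fisher_exact_greater(ft_pos_wgs_pos, ft_pos_wgs_neg,
                               ft_neg_wgs_pos, ft_neg_wgs_neg)
print(f"sensitivity = {sensitivity:.3f}, one-sided p = {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product of factorials would hit for p-values this small.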
We discovered that the expression of 3 lncFTs (TEX35-AL37796.1, AL050309.1-KLF8, and PVT1-IGL), together with the ISS stage and TP53 mutations, was associated with a significantly lower progression-free survival (PFS) (global p < .0001); the model stratifies MM patients into four risk groups according to the number of events they have (an event consists of having any of the five risk factors) (Figure 1M,N). Similarly, expression of 1 lncFT (TEX35-AL37796.1), together with the ISS stage, del(17p), or amp(1q), was also associated with significantly worse overall survival (OS) (global p < .0001), identifying five groups with significant differences in their OS (Figure 1O,P). An ANOVA test comparing the models derived from high-risk genetic factors only with those combining them with lncFTs showed a significant improvement for the combination of both types of risk factors for PFS (p-value = 6.3e−5, Figure S4B) and OS (p-value = .019, Figure S4C). These findings should be validated in other MM cohorts, but our results suggest that lncFTs in MM could contribute to better patient stratification, affecting patient management in terms of treatment choice or contributing to the identification of specific subgroups of patients suitable for personalized therapies.
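The event-based stratification for PFS can be pictured as counting, per patient, how many of the five risk factors are present. The sketch below is only an illustration under stated assumptions: the patient records are invented, ISS stage is reduced to a hypothetical boolean `high_iss` flag, and the rule that maps event counts to four groups (0, 1, 2, and 3 or more events) is a guess, since the text does not state the cut-offs.

```python
# The five PFS risk factors named in the text; "high_iss" is a hypothetical
# boolean encoding of ISS stage, which the source does not specify.
PFS_FACTORS = (
    "high_iss",
    "tp53_mutation",
    "TEX35-AL37796.1",
    "AL050309.1-KLF8",
    "PVT1-IGL",
)


def count_events(patient, risk_factors=PFS_FACTORS):
    """Number of risk factors present in one patient record (a dict of booleans)."""
    return sum(1 for factor in risk_factors if patient.get(factor, False))


def risk_group(n_events, n_groups=4):
    """Cap the event count so patients fall into groups 0 .. n_groups - 1.

    ASSUMPTION: the highest group pools every patient with at least
    n_groups - 1 events; the published cut-offs may differ.
    """
    return min(n_events, n_groups - 1)


# Invented patient records for illustration.
patients = {
    "P1": {},
    "P2": {"high_iss": True},
    "P3": {"high_iss": True, "PVT1-IGL": True},
    "P4": {"high_iss": True, "tp53_mutation": True,
           "TEX35-AL37796.1": True, "AL050309.1-KLF8": True},
}
groups = {pid: risk_group(count_events(record)) for pid, record in patients.items()}
```

The resulting group labels could then be passed to a survival analysis routine of choice to draw Kaplan–Meier curves per group.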

In summary, this study provides the first comprehensive landscape of expressed lncFTs and RTs in MM, demonstrating that FTs may also be expressed in normal B-cells and that expression of recurrent lncFTs may have a significant impact on PFS and OS in MM patients.




Updated: 2022-02-10