Dear Editor,

In recent years, considerable effort has been made to uncover the genetic component of chronic lymphocytic leukemia (CLL) susceptibility. To date, several genome-wide association studies (GWAS) and their meta-analyses have identified single-nucleotide polymorphisms (SNPs) associated not only with CLL risk [1] but also with patient survival [2]. Despite these notable results, both validation and functional characterization of the identified variants are still required before they can be used in a clinical setting. We therefore set out to validate the association of 41 GWAS-identified hits for CLL in 1158 CLL cases and 1947 controls ascertained through the Consortium for Research in Chronic lymphocytIc Leukemia (CRuCIAL), and to investigate their impact on host immune responses and their utility for predicting disease onset. Study participants were of European ancestry and gave written informed consent to participate in the study, which was approved by the ethical review committees of the participating institutions. Most CLL patients had Binet stage A and Rai stage I disease (67.00% and 79.83%, respectively) and, compared with controls, were older (mean age 66.19 ± 12.66 vs. 55.60 ± 11.50 years) and had a higher male/female ratio (1.54 vs. 0.91). SNP selection was based on published GWAS, functionality according to HaploReg data, and linkage disequilibrium between the SNPs. Genotyping was performed using KASP™ and TaqMan® assays. Hardy–Weinberg equilibrium was assessed in the controls (P > 0.001), and the association between SNPs and CLL was tested using multivariate unconditional logistic regression adjusted for age, sex, and country of origin. A meta-analysis of the CRuCIAL results with those from previous GWAS was conducted to validate genetic associations, and the I² statistic was used to assess statistical heterogeneity between studies (P_Het > 0.01).
The pooled OR was computed using a fixed-effect model, and the significance threshold for the meta-analysis was set to 5.0 × 10⁻⁸. Mechanistically, we evaluated the correlation of the GWAS-identified SNPs with the production of nine cytokines after in vitro stimulation of whole blood, peripheral mononuclear cells, and monocyte-derived macrophages from 408 healthy subjects of the Human Functional Genomic Project (HFGP) with LPS, PHA, Pam3Cys, CpG, Borrelia burgdorferi, and Escherichia coli. In parallel, we tested the correlation between selected SNPs and circulating concentrations of 103 serum and plasma inflammatory proteins, 7 plasma steroid hormones, and absolute numbers of 91 blood-derived immune cell populations. The HFGP study was approved by the Arnhem-Nijmegen Ethical Committee (42561.091.12), and biological specimens were collected after informed consent was obtained. A detailed description of the study population and participating centers, the selected SNPs, and the protocols and reagents used in the functional experiments is included in the Supplementary Material available on the Blood Cancer Journal website. To account for multiple comparisons, we used significance thresholds of 2.3 × 10⁻⁵, 1.2 × 10⁻⁵, 1.34 × 10⁻⁵, and 1.74 × 10⁻⁴ for the cytokine quantitative trait loci, proteomic, blood cell count, and steroid hormone analyses, respectively.
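
The fixed-effect pooling and I² heterogeneity statistic described above can be illustrated with a short Python sketch that combines per-study odds ratios by inverse-variance weighting. This is a minimal sketch and not the analysis pipeline used in the study: the function name `fixed_effect_meta`, the recovery of standard errors from 95% confidence intervals, and the example inputs are assumptions made purely for illustration.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def fixed_effect_meta(odds_ratios, ci_lows, ci_highs):
    """Inverse-variance fixed-effect pooling of per-study odds ratios.

    Standard errors are recovered from the 95% CI bounds on the log scale
    (an assumption; per-study SEs could be supplied directly instead).
    """
    betas, weights = [], []
    for or_, lo, hi in zip(odds_ratios, ci_lows, ci_highs):
        se = (math.log(hi) - math.log(lo)) / (2 * Z_975)
        betas.append(math.log(or_))
        weights.append(1.0 / se ** 2)

    w_sum = sum(weights)
    beta_pooled = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se_pooled = math.sqrt(1.0 / w_sum)

    # Two-sided P value from the normal approximation.
    z = beta_pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Cochran's Q and the I^2 statistic (percentage, truncated at zero).
    q = sum(w * (b - beta_pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "or": math.exp(beta_pooled),
        "ci": (math.exp(beta_pooled - Z_975 * se_pooled),
               math.exp(beta_pooled + Z_975 * se_pooled)),
        "p": p_value,
        "q": q,
        "i2": i2,
    }


# Hypothetical inputs: two studies reporting the same effect for one SNP.
result = fixed_effect_meta([1.5, 1.5], [1.2, 1.2], [1.875, 1.875])
print(result["or"], result["i2"], result["p"] < 5.0e-8)
```

The heterogeneity P value (P_Het) is not computed here because it requires the chi-square distribution; with SciPy available it could be obtained as `scipy.stats.chi2.sf(q, df)`.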

Logistic regression analyses confirmed the association of 21 SNPs with CLL risk at the P < 0.05 level in the CRuCIAL cohort. The strongest associations were found for SNPs located in the GRAMD1B locus (P = 6.2 × 10⁻¹⁶ and 6.0 × 10⁻⁴), which were further validated through meta-analysis (Table 1). The GRAMD1B locus (11q24.1) encodes a transporter that mediates the non-vesicular transport of cholesterol from the plasma membrane to the endoplasmic reticulum. Our experiments revealed that carriers of the GRAMD1B rs35923643G allele had increased numbers of transitional CD24+CD38+ B cells (P = 4.25 × 10⁻⁵; Fig. 1A), which exert an IL10-dependent immunosuppressive effect on pro-inflammatory responses against cancer cells. Carriers of this allele also had increased serum concentrations of IL18R1 (P = 0.00085; Fig. 1B), a receptor that is dysregulated in CLL and contributes to tumor escape from the immune system [3]. In support of the association of the GRAMD1B rs35923643 SNP with CLL risk, this variant is located among histone marks in primary B cells and alters motifs for PU1, MEF2A, POU2F2, NFKB, OCT2, and IRF4, which is linked to CLL onset [1]. Moreover, carriers of the GRAMD1B rs2953196G allele had decreased circulating concentrations of SIRT2 and ADA (P = 0.00037 and 0.00079; Fig. 1C, D). SIRT2 is overexpressed in primary CLL cells and plays a key role in determining cell survival [4]. Recent studies have shown that increased serum levels of SIRT2 are associated with longer overall survival [5], whereas SIRT2 inhibitors induce cell death in leukemic cell lines [6]. Similarly, ADA, an enzyme of purine metabolism related to lymphoid T cell differentiation and tumor cellular responses, is overexpressed in CLL patients and correlates with longer survival [7].
Another study showed that blockade of A2A adenosine receptors made CLL cells more susceptible to pharmacological treatments while restoring immune competence and T cell proliferation [8]. Serra and coworkers also showed that activation of the ADO receptors inhibited chemotaxis and limited drug-induced apoptosis of CLL cells [9]. Finally, carriers of the GRAMD1B rs2953196G allele had decreased serum concentrations of STAMBP (P = 0.00033; Fig. 1E), a key protein involved in the control of autophagic flux and the NLRP3 inflammasome. These results suggest that the GRAMD1B locus might exert its biological effect on CLL by modulating SIRT2, STAMBP, and ADA, the last of which is a diagnostic biomarker for CLL that has been included in a new prognostic score designed to optimize patient risk stratification [7].

Table 1 Validation of GWAS-identified variants for CLL.
Fig. 1: Functional characterization of GWAS-identified variants for CLL (A–M) and receiver operating characteristic (ROC) curve analysis (N).

Correlation between functional data and GWAS-identified SNPs was evaluated by linear regression analysis adjusted for age and sex. The ROC curve summarizes the predictive accuracy of each model. The model including demographic variables and SNPs significantly associated with the risk of developing CLL (blue) showed significantly improved predictive capacity compared with a reference model including only age and sex as covariates (red): AUC = 0.809 vs. AUC = 0.765; N = 2123 subjects; P_LRtest = 2.2 × 10⁻¹⁶.

Besides these findings, the meta-analysis confirmed the association of 29 additional SNPs with the risk of developing the disease (OR_Meta = 1.15–1.71; Table 1), which suggests a functional role for these markers in modulating CLL risk. In this regard, our experiments revealed that carriers of the IRF8 rs391855A allele showed increased numbers of class-switched CD27−IgM−IgD− memory B cells (P = 3.39 × 10⁻⁵; Fig. 1F) and central memory CD4+CD45RA−CD27+ T cells (P = 0.0001; Fig. 1G), whereas carriers of the CXXC1 rs1036935A allele had decreased numbers of CD19+CD20+ B cells (P = 0.00075; Fig. 1H), a subset of cells poorly expressed in CLL patients [10]. The IRF8 locus encodes a transcription factor, exclusively expressed in immune cells, that regulates B cell-activating factor (BAFF)-mediated B cell activation, cell survival, adaptive NK cell responses, and CD8/CD4 T cell differentiation. In line with these findings, carriers of the PRKD2 rs11083846A allele showed decreased numbers of transitional CD24+CD38+ B cells (P = 0.00046; Fig. 1I), whereas carriers of the ILRUN rs3800461C allele had decreased levels of HLA-DR+ T regulatory and conventional CD4+ T cells. Finally, carriers of the POU5F1P2 rs2511714G allele showed increased numbers of CD8+ effector memory (CD45RA−CD27−) T cells. The POU5F1P2 rs2511714G SNP is located among histone marks in primary B cells, whereas the PRKD2 rs11083846 SNP is an eQTL in whole blood for PRKD2 and also for SLC1A5, CALM3, and FKRP, genes that have been associated with CLL onset [11]. We hypothesize that the IRF8, CXXC1, ILRUN, and POU5F1P|ODF1 loci might influence CLL risk by modulating specific subsets of B cells, T cells, and regulatory T cells that play critical roles in the pathogenesis of the disease [12] and influence prognosis. Indeed, peripheral CD4+ regulatory T cell populations in CLL are known to be associated with disease progression and to have prognostic value [13].
In addition, we found a correlation between the TERT rs7705526A allele and decreased serum concentrations of TRAIL and TWEAK (P = 5.23 × 10⁻⁵ and 0.0001; Fig. 1J, K), which are involved in the regulation of key cell functions including immune responses, inflammation, proliferation, differentiation, and apoptosis. These results agree with reports that CLL patients exhibit reduced serum TRAIL both before and after treatment [14] and that its aberrant expression in CLL promotes cell survival [15]. Similarly, we found a correlation of the TSBP1-AS1 rs926070G allele with decreased concentrations of IL12 and TWEAK (P = 0.00023 and 0.00050; Fig. 1L, M), which reinforces the idea that TWEAK and TWEAK-mediated immune responses are implicated in CLL. In support of this finding, TWEAK has been reported to attenuate the transition from innate to adaptive immunity, which might affect blood cell populations and immune responses and, consequently, influence susceptibility to CLL. On the other hand, carriers of the ILRUN rs3800461C allele had decreased numbers of conventional CD4+ T cells and HLA-DR+ T regulatory cells (P = 0.00041 and 0.00058), whereas carriers of the POU5F1P2 rs2511714G allele showed increased numbers of CD8+ effector memory CD45RA−CD27− cells (P = 0.00053; Supplementary Material). No functional effect was observed for the remaining SNPs.

Considering the number of variants significantly associated with CLL risk, we sought to establish the clinical usefulness of genetic biomarkers in predicting disease onset using two approaches: a predictive model built from demographic variables and SNPs significantly associated with CLL risk, and weighted and unweighted polygenic risk scores (PRSs; Supplementary Material). The area under the curve (AUC) of a receiver operating characteristic curve analysis and −2 log-likelihood ratio (LR) tests showed that a model including age, sex, and 16 SNPs significantly improved the ability to predict disease onset compared with the reference model including only demographic variables (AUC = 0.809 vs. AUC = 0.765; P_LRtest = 2.2 × 10⁻¹⁶; Fig. 1N). We also computed weighted and unweighted PRSs in a subset of 806 CLL cases and 1417 controls from the CRuCIAL cohort and found OR = 6.81 (95% CI 4.65–9.96, P = 2.0 × 10⁻²¹) for the highest vs. lowest quintile of the unweighted score and OR = 10.45 (95% CI 6.96–15.70, P = 2.0 × 10⁻²⁷) for the highest vs. lowest quintile of the weighted score. Strong associations were also observed when weighted scores were built using ORs from the original GWASs. The best AUC value was observed for the weighted score computed in the CRuCIAL cohort (AUC = 0.68, 95% CI 0.65–0.70).
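
To make the score-based approach concrete, the following Python sketch shows one common way to compute unweighted and weighted PRSs, a highest-vs-lowest quintile odds ratio, and a rank-based AUC. It is a sketch under stated assumptions rather than the method used in the study: the exact PRS definition is given only in the Supplementary Material, so the function names, the use of risk-allele counts weighted by ln(OR), the rank-based quintile assignment, the unadjusted 2×2 odds ratio, and all example values are hypothetical.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def polygenic_scores(risk_allele_counts, log_ors):
    """Return (unweighted, weighted) scores for one individual.

    Unweighted: total number of risk alleles (0, 1, or 2 per SNP).
    Weighted: risk-allele counts multiplied by per-SNP ln(OR) and summed.
    """
    unweighted = sum(risk_allele_counts)
    weighted = sum(n * b for n, b in zip(risk_allele_counts, log_ors))
    return unweighted, weighted


def top_vs_bottom_or(scores, is_case):
    """Unadjusted OR (with Woolf 95% CI) for the top vs. bottom fifth by rank.

    Assumption: quintiles are assigned by rank over all subjects; tied
    scores are split arbitrarily at the quintile boundaries.
    """
    if len(scores) < 5:
        raise ValueError("need at least five subjects to form quintiles")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fifth = len(order) // 5
    bottom, top = order[:fifth], order[-fifth:]
    a = sum(is_case[i] for i in top)        # cases in top quintile
    b = len(top) - a                        # controls in top quintile
    c = sum(is_case[i] for i in bottom)     # cases in bottom quintile
    d = len(bottom) - c                     # controls in bottom quintile
    if 0 in (a, b, c, d):                   # continuity correction for empty cells
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - Z_975 * se), math.exp(log_or + Z_975 * se)


def auc(scores, is_case):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, is_case) if y]
    controls = [s for s, y in zip(scores, is_case) if not y]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical individual genotyped at three SNPs with illustrative ORs.
print(polygenic_scores([2, 1, 0], [math.log(1.5), math.log(1.2), math.log(2.0)]))
```

The quintile odds ratios reported in the text may have been estimated with covariate adjustment; the 2×2 calculation above is the simplest unadjusted analogue and is meant only to show what "highest vs. lowest quintile" compares.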

In conclusion, this study confirmed the association of 31 GWAS-identified SNPs with CLL risk and shed light on the role of some of these biomarkers in modulating TReg, B, and T cell differentiation and proliferation, blood concentrations of B cell-related proteins, cell survival, and the expression of immune- and non-immune-related loci. Although outside the scope of the current study, additional functional studies using blood samples from CLL patients are still required to validate our findings and to decipher the exact biological mechanisms behind the observed associations. A potential limitation of this work is the relatively small size of the CRuCIAL cohort, which hampered the validation of SNPs showing modest associations.