Background

Often considered one disease, sporadic colorectal cancer (CRC), accounting for 95% of CRC cases, is a heterogeneous disease arising from different sets of genetic and epigenetic alterations.1 The most established underlying molecular pathological subtypes of CRC are characterised by microsatellite instability (MSI) (prevalence 15% in sporadic CRC), CpG island methylator phenotype (CIMP-high, 20%),2 B-Raf proto-oncogene serine/threonine kinase gene mutations (BRAF mutations, 10%) and Kirsten rat sarcoma viral oncogene homologue gene mutations (KRAS mutations, 30–50% of sporadic CRC cases). KRAS and BRAF mutations are considered mutually exclusive.3 Another important molecular feature is the adenomatous polyposis coli (APC) gene, a key tumour suppressor gene mutated in 45% to 81% of sporadic CRC cases.2

Smoking is associated with increased risk of CRC.4 Moderate to high intake of alcohol was shown to increase CRC risk in a linear dose–response association.5 Several studies found smoking was associated with a higher risk of MSI-high,6,7,8,9,10 CIMP-high,8,10,11 BRAF-mutated (mut)8,10,11 and KRAS-wild-type (wt) CRC,12,13 while others found no differential association by CRC subtype.13,14,15,16 In one study, alcohol was associated with increased risk of MSI-high compared to microsatellite stable (MSS) CRC.17 In other previous studies, alcohol was not differentially associated with MSI,9,18 CIMP, BRAF19,20,21 or KRAS status.22

Since smoking and high alcohol consumption are often correlated, both risk factors were investigated in this study. The aim of this study was to extend current knowledge on the associations of smoking and alcohol consumption with major molecular subtypes and pathways of CRC.

Methods

Study population

The DACHS study (Darmkrebs: Chancen der Verhütung durch Screening; CRC: chances for prevention through screening), an ongoing case–control study with follow-up of CRC cases, was initiated in 2003 and has been described in detail previously.23,24 In short, cases with a first, histologically confirmed, diagnosis of CRC and randomly selected control participants with no history of CRC, frequency matched to cases by age, sex and county of residence, are recruited in the Rhine-Neckar-Odenwald region in Germany (~2 million inhabitants). The DACHS study was approved by the ethics committees of the Medical Faculty of Heidelberg University and the state medical boards of Baden-Wuerttemberg and Rhineland-Palatinate.

Data collection

Eligible cases were identified in hospitals within the study region and, after giving written informed consent, were interviewed by trained interviewers using a standardised questionnaire during hospitalisation after surgery or at home after discharge. The median time between diagnosis and interview was 24 days (interquartile range: 10–224 days). Control participants were randomly selected from population registries and contacted through the study centre to schedule home interviews. Controls with a history of CRC were excluded. Controls opting out of the interview were offered a self-administered short questionnaire. Based on hospital data, ~50% of eligible patients were recruited. The participation rate of eligible controls was 51%.

The current analysis is based on DACHS participants recruited in 2003–2010, as comprehensive molecular tumour analyses of MSI, CIMP, BRAF and KRAS were performed in full for cases joining the study in that period. Participants reporting having Crohn’s disease or ulcerative colitis (N = 28) were excluded from the current analysis. Control participants who answered the short questionnaire only (N = 658) were also excluded due to missing required information on lifetime alcohol consumption and time of smoking cessation (Supplementary Fig. 1).

Assessment of smoking and alcohol consumption

Participants were interviewed regarding smoking history prior to diagnosis (for cases) or interview (controls). Participants were classified as non-smokers if they had never smoked regularly or as former smokers if they had stopped smoking at least 2 years before diagnosis (cases) or interview (controls).

Participants were asked about alcohol consumption in each decade of life from the age of 20 years until diagnosis (cases) or interview (controls). Alcohol consumption was calculated in grams of ethanol per day. Data were collected on the number of portions of beer (0.33 l), wine (0.25 l) and liquor (0.02 l) that participants habitually drank. Ethanol content was derived from food composition tables,25 assuming an average of 4, 8.6 and 33 g of pure ethanol in 100 ml of beer, wine and liquor, respectively.
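The conversion from reported portions to grams of ethanol per day can be illustrated with a minimal sketch. Portion sizes and ethanol contents are taken from the paragraph above; the function names, the weekly input format and the time-weighted lifetime average are assumptions made for illustration only and are not described in the study protocol.

```python
# Portion sizes (ml) and ethanol content (g per 100 ml) as stated in the text.
PORTION_ML = {"beer": 330.0, "wine": 250.0, "liquor": 20.0}
ETHANOL_G_PER_100ML = {"beer": 4.0, "wine": 8.6, "liquor": 33.0}


def ethanol_per_portion(beverage: str) -> float:
    """Grams of pure ethanol in one portion of the given beverage."""
    return PORTION_ML[beverage] * ETHANOL_G_PER_100ML[beverage] / 100.0


def daily_ethanol(portions_per_week: dict) -> float:
    """Average grams of ethanol per day from weekly portion counts.

    The weekly input format (e.g. {"beer": 7, "wine": 2}) is hypothetical;
    the questionnaire's actual format is not given in the text.
    """
    weekly = sum(ethanol_per_portion(b) * n for b, n in portions_per_week.items())
    return weekly / 7.0


def lifetime_average(periods: list) -> float:
    """Time-weighted average over life periods given as (years, g_per_day).

    Assumption: periods are weighted by their length in years.
    """
    total_years = sum(years for years, _ in periods)
    if total_years <= 0:
        raise ValueError("total duration must be positive")
    return sum(years * grams for years, grams in periods) / total_years
```

Under these figures, one portion of beer, wine or liquor corresponds to roughly 13.2 g, 21.5 g and 6.6 g of ethanol, respectively.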

Tumour tissue analyses

Details of tumour tissue analyses of MSI, BRAF, KRAS and CIMP have been reported previously.26 In short, formalin-fixed, paraffin-embedded surgical specimens of CRC tumours were collected from cooperating pathology institutes and transferred to the tissue bank at the National Centre for Tumour Diseases (NCT) in Heidelberg. MSI analysis was performed using a mononucleotide marker panel (BAT25, BAT26 and CAT25), which differentiates MSI-high from non-MSI-high tumours with a sensitivity of 98.2% and a specificity of 100%, and with 100% concordance of MSI-high tumours compared with the National Cancer Institute/International Collaborative Group on HNPCC marker panel (BAT25, BAT26, D17S250, D2S123 and D5S346) for the evaluation of MSI in CRC.27,28,29

In about half of the tumour samples, KRAS mutation status was determined by a single-stranded conformational polymorphism technique using the same DNA sample, and expression of BRAF V600E was determined by immunohistochemical analyses performed independently by two pathologists (91% concordance, κ = 0.59). Discordant cases were discussed to obtain a final evaluation. In the other half of the tumour samples, KRAS mutation status and BRAF mutation status were determined by Sanger sequencing as reported previously.30

CIMP was determined after DNA bisulfite conversion as previously described.31 Tumours were classified as CIMP-high when 3–5 of the five investigated loci (MGMT, MLH1, MINT1, MINT2 and MINT31) had a positive methylation status, and as CIMP-low/negative when 0–2 loci were positive.
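The CIMP classification rule can be sketched as a simple count over the five loci. The function name and the input format (a mapping from locus to methylation status) are hypothetical, and rejecting samples with missing loci is an assumption, since the handling of incomplete panels is not described here.

```python
CIMP_LOCI = ("MGMT", "MLH1", "MINT1", "MINT2", "MINT31")


def classify_cimp(methylated: dict) -> str:
    """Classify a tumour as CIMP-high (3-5 positive loci) or
    CIMP-low/negative (0-2 positive loci)."""
    missing = [locus for locus in CIMP_LOCI if locus not in methylated]
    if missing:
        # Assumption: incomplete panels are rejected rather than imputed.
        raise ValueError(f"missing methylation status for: {missing}")
    n_positive = sum(bool(methylated[locus]) for locus in CIMP_LOCI)
    return "CIMP-high" if n_positive >= 3 else "CIMP-low/negative"
```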

Statistical analyses

Multinomial logistic regression models were used to estimate adjusted odds ratios (ORs) and 95% confidence intervals (CIs) for the association of smoking and alcohol consumption with CRC risk according to molecular features and pathways. The models were adjusted for covariates known to be associated with CRC risk in all regression analyses: sex, age, body mass index (BMI) 5–14 years before diagnosis (cases) or interview (controls), education level, using non-steroidal anti-inflammatory drugs (NSAIDs) regularly for more than a year (yes, no), history of CRC in a first-degree family member, previous large bowel endoscopy and diabetes. Additionally, average lifetime daily ethanol consumption was included as a covariate in the smoking analyses and smoking (never, former, current) as a covariate in the alcohol consumption analyses. Ever, former and current smoking were compared to never smoking. High alcohol consumption was defined as the fourth quartile of the average daily lifetime gram ethanol consumption among alcohol drinkers (>24.6 g) and was compared in analyses to low/never consumption (≤24.6 g).
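As an illustration of how the reported quantities relate to one another, the following sketch converts a logistic regression coefficient and its standard error into an OR with a 95% CI, expresses an OR as a percentage increase, and applies the alcohol cut-off. It assumes a Wald-type interval; the text does not state which interval method was used in R, and all function names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI
HIGH_ALCOHOL_CUTOFF_G = 24.6  # fourth-quartile threshold stated in the text


def odds_ratio_ci(beta: float, se: float) -> tuple:
    """Return (OR, lower, upper) from a log-odds coefficient and its
    standard error, using a Wald interval (an assumption)."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def percent_increase(odds_ratio: float) -> float:
    """Express an OR as a percentage increase in odds, e.g. 1.59 -> about 59."""
    return (odds_ratio - 1.0) * 100.0


def is_high_alcohol(grams_per_day: float) -> bool:
    """High consumption is strictly above the cut-off; <= cut-off is low/never."""
    return grams_per_day > HIGH_ALCOHOL_CUTOFF_G
```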

In case–control analyses, each of the molecular features or pathways was compared to all study controls. In addition, combinations of single tumour markers approximating the traditional (MSS, CIMP-low/negative, BRAF-wt, KRAS-wt), sessile serrated (CIMP-high, BRAF-mut) and alternate (MSS, CIMP-low/negative, KRAS-mut) pathways to the development of CRC were examined.32 To assess heterogeneity in CRC risk between subtypes and pathways, case–case analyses were conducted with the same covariates as in the case–control analyses. All statistical tests were two sided and the significance level (α) was 0.05. Analyses were conducted using R version 3.4.4.33
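The marker combinations used to approximate the three pathways can be written as a small decision rule. This is a sketch that follows the definitions above literally; the function name is hypothetical, it assumes all four markers are available for a tumour, and tumours matching none of the three combinations are returned as unclassified.

```python
from typing import Optional


def classify_pathway(msi_high: bool, cimp_high: bool,
                     braf_mut: bool, kras_mut: bool) -> Optional[str]:
    """Assign a tumour to a pathway from its four marker statuses.

    Returns None when the marker combination matches no defined pathway.
    """
    # Sessile serrated: CIMP-high and BRAF-mut (MSI and KRAS status are
    # not part of the stated definition).
    if cimp_high and braf_mut:
        return "sessile serrated"
    # The remaining two pathways both require MSS and CIMP-low/negative.
    if not msi_high and not cimp_high:
        if kras_mut:
            # Alternate: MSS, CIMP-low/negative, KRAS-mut. The stated
            # definition puts no condition on BRAF status.
            return "alternate"
        if not braf_mut:
            # Traditional: MSS, CIMP-low/negative, BRAF-wt, KRAS-wt.
            return "traditional"
    return None
```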

Results

A total of 4919 participants, 2444 cases and 2475 controls, were included in the current analysis. Descriptive statistics for study participants are shown in Table 1. Current smoking was associated with a 59% (OR = 1.59, 95% CI: 1.30–1.94) increased risk of CRC, while former smoking was associated with a 19% (OR = 1.19, 95% CI: 1.03–1.38) increased risk. However, risk was not increased if smoking cessation was more than 20 years ago. More than 29 pack years of smoking were associated with 61% increased CRC risk (OR = 1.61, 95% CI: 1.31–1.99) compared to never smoking. High alcohol consumption (>24.6 g/day) was associated with increased CRC risk (OR = 1.27, 95% CI: 1.08–1.50) (Supplementary Table 1).

Table 1 Characteristics of study population.

Smoking and CRC risk by molecular pathological subtypes

Current smoking compared to never smoking showed much higher odds ratios for MSI-high (OR = 2.79, 95% CI: 1.86–4.18) compared to MSS CRC (OR = 1.41, 95% CI: 1.14–1.75, p-heterogeneity (p-het) = 0.001), for BRAF-mut (OR = 2.40, 95% CI: 1.41–4.07) compared to BRAF-wt CRC (OR = 1.52, 95% CI: 1.24–1.88, p-het = 0.074), for KRAS-wt (OR = 1.70, 95% CI: 1.36–2.13) compared to KRAS-mut CRC (OR = 1.26, 95% CI: 0.95–1.68, p-het = 0.039) and for CIMP-high CRC (OR = 2.01, 95% CI: 1.40–2.88) compared to CIMP-low/negative CRC (OR = 1.50, 95% CI: 1.22–1.85, p-het = 0.101), although not all differences in association were statistically significant at the p < 0.05 level. The results and differences were not as strong for the comparison of ever vs. never smoking (Table 2 and Fig. 1).

Table 2 Association of smoking and alcohol consumption with CRC risk by single molecular pathological subtypes.
Fig. 1: Association of current smoking with CRC risk overall and by molecular pathological subtypes and pathways.

Heterogeneity between molecular subtypes was assessed in case-case comparison. For the pathways, heterogeneity was assessed using the traditional pathway as reference.

Smoking and CRC risk by molecular pathological pathways

Ever and current smoking were significantly associated with higher risk of CRCs that were grouped into the traditional (OR = 1.50, 95% CI: 1.16–1.94) or the serrated pathways (OR = 2.39, 95% CI: 1.27–4.52) (Table 3 and Fig. 1). On the other hand, no association was found between smoking and CRC developing along the alternate pathway, characterised by MSS, CIMP-low/negative and KRAS mutation (OR = 1.08, 95% CI: 0.77–1.52, p-het = 0.062).

Table 3 Association of smoking and alcohol consumption with CRC risk by molecular pathological pathways.

Alcohol consumption and CRC risk by molecular pathological subtypes and pathways

Average lifetime daily consumption of more than 24.6 g ethanol was associated with around 30% increased risk of the non-aberrant, more frequent CRC subtypes (MSS, BRAF-wt, KRAS-wt, CIMP-low/neg) and not with the aberrant subtypes (MSI, BRAF-mut, CIMP-high), but no statistically significant differences were observed in heterogeneity testing between subtypes in this study. The strength of the association of high alcohol consumption with risk of traditional pathway CRC and the direction of its association with serrated pathway CRC were comparable to those of current smoking, but heterogeneity was not statistically significant (Tables 2 and 3 and Fig. 2).

Fig. 2: Association of high alcohol consumption with CRC risk overall and by molecular pathological subtypes and pathways.

Heterogeneity between molecular subtypes was assessed in case-case comparison. For the pathways, heterogeneity was assessed using the traditional pathway as reference.

Joint effects of smoking and alcohol

No interaction was found between high lifetime average daily alcohol consumption and ever smoking (p-interaction = 0.519) in the association with CRC risk, and no major differences were found when analysing alcohol consumption (high vs. low/none) stratified by smoking status (ever/never) and CRC risk for the different molecular subtypes or pathways (data not shown).

Discussion

This large population-based case–control study aimed to examine the associations of smoking and alcohol consumption with CRC risk by CRC subtypes and pathways characterised by MSI, BRAF mutation, KRAS mutation and CIMP status. Ever and current smoking were associated with higher risk of CRC, especially of MSI-high, BRAF-mut, KRAS-wt and CIMP-high CRC. Smoking was also associated with higher risk of cancers developing via the traditional or the serrated pathways. High alcohol consumption was not differentially associated with single CRC subtypes or pathways, but associations with molecular pathways still seemed similar to those of current smoking.

Our findings are generally in agreement with previous studies. Smoking was associated with a higher risk of MSI-high compared to MSS CRC in five previous studies6,7,8,9,10 and in a meta-analysis published in 2018.34 Stronger associations were also previously reported for BRAF-mut compared to BRAF-wt CRC,8,10,11 for KRAS-wt compared to KRAS-mut CRC12,13,14 and for CIMP-high compared to CIMP-low/negative CRC.10,11 Smoking has also been found to be associated with the serrated-polyps pathway, defined by CIMP-high and BRAF-mut status.35,36,37 Further, in accordance with previous studies, no major or statistically significant differences were found in the associations between alcohol consumption and CRC risk by molecular pathological subtypes,9,17,18,19,22 although the observed associations with CRC pathways pointed to potential differences in our study.

Although several possible biological mechanisms were proposed,10,38 the way smoking increases the risk of MSI-high CRC is still not established. Our results also support a strong link between smoking and BRAF mutation, which is regarded as the initiating event in sessile serrated adenomas, followed by methylation of key tumour suppressor genes, which would also be supported by the stronger association with CIMP-high CRC and the sessile serrated pathway.32,39

Molecular pathological epidemiology (MPE) focuses on the heterogeneous aetiology of CRC based on molecular tumour features. While studies linking smoking with CRC risk found an overall increase of around 26% in risk,40,41 our study provides more specific risk estimates by known CRC subtypes and pathways. This distinction can help provide more evidence for the causal relationship between smoking and CRC risk and for its mechanisms.

The large size of the study, its population-based design, the comprehensive assessment of smoking, alcohol consumption and other lifestyle, medical and family history factors, and the analysis of multiple major molecular tumour tissue markers, are notable strengths of this study. To the best of our knowledge, this is the first study examining the effect of both smoking and alcohol consumption on CRC risk by major molecular pathological tumour features and pathways of CRC. This study adds to the limited knowledge about the potential smoking-related increase in CRC risk according to molecular features, potentially linking smoking differentially to specific molecular pathways.

The study also has limitations. MPE is a relatively new field of research, thus analyses are often exploratory and need confirmation from other studies.1 Not all patients with available tumour tissue samples could be included in the subtype analyses: MSI status was available for 90% of cases, BRAF for 92%, KRAS for 92% and CIMP for 99% of cases. Another limitation is that some analysed subgroups of cases were small, which limited the statistical power of the study to detect differences, particularly when multiple tumour features were combined. As this is an observational study, based on self-reports during standardised interviews, smoking, alcohol and other relevant factors may be subject to information bias.

In summary, based on results from this large population-based case–control study, smoking, and in particular current smoking, showed the strongest association with increased risk of the MSI-high and KRAS-wt molecular subtypes of CRC and of CRC showing features of the sessile serrated pathway. No major differences were observed for the association of alcohol with subtypes, but potential differences according to pathways should be investigated in future studies. More large studies with tumour marker combinations are needed to confirm these results for a better characterisation of the carcinogenic mechanisms underlying these associations.