Background

Invasive micropapillary carcinomas (IMPCs) are classified by the World Health Organization classification of breast tumors as a subset of carcinomas with specific morphological features [1]. IMPCs were first described in 1980 by Fisher et al. [2]. Morphologically, IMPCs are characterized by small solid cell nests or rings of tumor cells, surrounded by empty stromal spaces mimicking lymphatic spaces or invasion in adipose tissue. Remarkably, the tumor cells show a reversed polarity, meaning that the apical pole of the cells is oriented toward what usually is considered the basal pole of the cells [1]. The reversed polarity can be better demonstrated by immunohistochemistry with antibodies directed against epitopes of the epithelial membrane antigen (i.e., EMA or MUC1 stainings), which highlight the typical inside-out growth pattern [3]. The majority of IMPCs are estrogen receptor (ER) and/or progesterone receptor (PR)-positive. Amplification of human epidermal growth factor receptor 2 (HER2) is seen in 10–30% of IMPC cases, while a triple-negative phenotype is rarely encountered [1].

The reported incidence of pure IMPCs in the literature is low, ranging from 0.9 to 2% of all breast cancers [1]. IMPCs have a reputation for more aggressive behavior, because they more often present with larger tumor sizes and have a higher frequency of blood vessel and lymphatic vessel invasion (LVI) and regional lymph node metastasis at initial presentation compared to invasive breast carcinoma of no special type [4,5,6,7]. However, whether IMPCs have a poorer prognosis than breast carcinomas of no special type remains debated [8,9,10]. Additionally, in recent years the assessment of stromal tumor infiltrating lymphocytes (sTILs) in the peritumoral stroma has reached level 1b evidence as a useful prognostic marker, especially in certain breast cancer subtypes [11, 12]. Interestingly, until now only one study has reported on sTILs in IMPCs as a potential prognostic marker, unfortunately using a less established scoring method [13].

In this study we used a clinically well annotated and histologically well characterized retrospective series of IMPCs to further assess the prognostic value of sTILs, applying the novel scoring method proposed by the International Immuno-Oncology Working Group. Additionally, we sought to evaluate by immunohistochemistry additional biomarkers that are usually linked to prognosis in luminal breast cancers.

Methods

Case selection and surrogate subtyping

The study was approved by the KU Leuven ethics committee (MP002896). The database of the multidisciplinary breast center of University Hospitals Leuven was retrospectively reviewed for all patients with IMPC diagnosis and stage I-III disease between January 2000 and December 2016 (n = 151). After exclusion of patients with bilateral and multifocal breast cancer, 111 patients were included in the study. Primary metastatic disease, neoadjuvant chemotherapy, insufficient material, missing HER2-status, treatment at another institution, prior breast cancer history and absence of IMPC diagnosis after review were exclusion criteria. Case selection, exclusion criteria and methodology are summarized in Fig. 1. The treatment method for all patients was determined on a rule-based method according to the institutional protocols.

Fig. 1

Overview of case selection. The flow chart illustrates the cases initially selected from the database and those effectively included in the study for further analysis after exclusion. Starting from the Multidisciplinary Breast center database of the University Hospitals Leuven (n = 151) to the case selection for sTILs-assessment (n = 111) and immunohistochemical stainings on Tissue Micro Arrays (n = 105). NACT neoadjuvant chemotherapy, sTILs stromal tumor infiltrating lymphocytes, CNB core needle biopsy, IHC immunohistochemistry

All included cases were centrally reviewed by two independent pathologists (F.D., G.F.). Cases with a micropapillary component of more than 75% were considered pure IMPC; all other cases were considered non-pure IMPC. Among non-pure IMPCs, cases with a micropapillary component of less than 5% were excluded from further analysis. In over 85% (n = 97) of the 111 cases, immunostaining with EMA (clone 1R629, DAKO) was available and corroborated the diagnosis of IMPC.

Clinicopathological information including menopausal status, immunohistochemistry and/or FISH of predictive markers (ER, PR, HER2), tumor size, tumor grade and TNM classification was retrieved from the database and the medical records. ER, PR and HER2 status were defined based on the guidelines of the American Society of Clinical Oncology/College of American Pathologists [14, 15]. Because Ki67 was not routinely performed, surrogate breast cancer subtype classification was applied in accordance with the classification of Brouckaert et al. [16]. Grade 1–2 and grade 3 ER-positive HER2-negative IMPC were considered luminal A-like and luminal B-like, respectively. ER-positive tumors with HER2 amplification were referred to as luminal B-like HER2+. ER-, PR- and HER2-negative IMPC were classified as triple-negative breast cancers (TNBC); tumors lacking ER and PR nuclear expression with HER2 amplification were considered HER2+. Data on clinical outcome were not available during scoring of sTILs and immunostainings.
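As an illustration only, the surrogate subtyping rules stated above can be written as a small Python function. The function name, the argument names and the "Unclassified" fallback for marker combinations that the text does not describe (for example ER-negative, PR-positive tumors) are assumptions made for this sketch; they are not part of the classification of Brouckaert et al. [16].

```python
def surrogate_subtype(er_positive, pr_positive, her2_amplified, grade):
    """Assign a surrogate subtype from ER, PR, HER2 and grade (sketch).

    Only the rules spelled out in the Methods text are encoded here.
    Combinations the text does not mention return "Unclassified",
    which is an assumption of this sketch.
    """
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2 or 3")

    if er_positive:
        if her2_amplified:
            return "Luminal B-like HER2+"
        # ER-positive, HER2-negative: the grade separates A-like from B-like.
        return "Luminal A-like" if grade in (1, 2) else "Luminal B-like"

    # ER-negative branch
    if not pr_positive:
        return "HER2+" if her2_amplified else "TNBC"

    # ER-negative but PR-positive: not covered by the stated rules.
    return "Unclassified"


print(surrogate_subtype(True, True, False, 2))    # Luminal A-like
print(surrogate_subtype(False, False, True, 3))   # HER2+
```

Note that PR status only matters in the ER-negative branch, which mirrors the wording of the rules above.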

Assessment of tumor-associated inflammatory component on hematoxylin and eosin (H&E)

The sTILs levels were independently evaluated by two pathologists (F.D., G.F.) on H&E stained slides from the resection specimen of the 111 included patients. In 47 cases, the matching diagnostic core needle biopsy (CNB) was available for evaluation as well. The 2014 and 2017 recommendations of the International Immuno-Oncology Working Group were used for the scoring [12, 17]. In short, sTILs were recorded as the average of the % of tumor-associated stroma infiltrated by mononucleated inflammatory cells at the invasive tumoral front in three to five fields at × 200 magnification. The sTILs scores obtained from the resection specimens were used for further statistical analysis, either as continuous or categorical variables upon subdivision in three categories as follows: sTILs high (≥ 50%), sTILs intermediate (30–49%) and sTILs low (< 30%). The 50% cut-off was based on the previously published definition of lymphocytic predominant breast cancer [18] and the 30% cut-off on the exploratory prognostic sTILs cut-off proposed for TNBC [11].
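The scoring and categorization described above amount to a simple average followed by two thresholds. The following Python sketch shows one way to express this; the function names are hypothetical, the field values in the example are invented for illustration, and treating scores between 49% and 50% as intermediate is an assumption, since the text only gives the ranges 30–49% and ≥ 50%.

```python
from statistics import mean


def stils_score(field_percentages):
    """Average the % of stroma occupied by sTILs over three to five fields."""
    if not 3 <= len(field_percentages) <= 5:
        raise ValueError("expected three to five fields")
    if any(not 0 <= p <= 100 for p in field_percentages):
        raise ValueError("percentages must lie between 0 and 100")
    return mean(field_percentages)


def stils_category(score):
    """Map a sTILs score (%) to the three categories used in the study."""
    if score >= 50:
        return "high"
    if score >= 30:          # assumption: [30, 50) counts as intermediate
        return "intermediate"
    return "low"


# Hypothetical readings from four fields at the invasive front.
score = stils_score([20, 35, 40, 45])
print(score, stils_category(score))   # 35 intermediate
```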

Applying the same method, we also explored differences in the geographical distribution of sTILs. In particular, we studied the proportion of sTILs as a global average and in the hot spots, both at the invasive front and in the inner part of the tumor. The hot spot region was first chosen at scanning magnification and was selected by one pathologist (G.F.) as the region showing the highest proportion of sTILs, avoiding areas of lobulitis. The actual sTILs score was then assessed at × 200 magnification in the center of the selected region, which often also included nests of tumor cells. The difference between the invasive and central hot spot regions was calculated and reported as a delta value with three categories (Δ > 0, Δ < 0 and Δ = 0, representing, respectively, a higher sTILs level at the invasive front, a higher sTILs level in the inner region, or an equal distribution). For further statistical analyses we focused on the hot spot values, while for the comparison of pure and non-pure IMPC the spatial distribution of the global average was also tested.
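The delta value described above is the hot spot score at the invasive front minus the hot spot score in the inner part of the tumor. A minimal sketch, with a hypothetical function name and plain-text category labels chosen for this example, could look like this:

```python
def hot_spot_delta(front_percent, inner_percent):
    """Return the delta (front minus inner) and its category label."""
    delta = front_percent - inner_percent
    if delta > 0:
        label = "delta > 0"   # more sTILs at the invasive front
    elif delta < 0:
        label = "delta < 0"   # more sTILs in the inner region
    else:
        label = "delta = 0"   # equal distribution
    return delta, label


# Hypothetical hot spot scores (%), for illustration only.
print(hot_spot_delta(40, 25))   # (15, 'delta > 0')
```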

To further characterize the composition of the inflammatory infiltrate, we also looked at the presence of tumor-associated plasma cells (TAPC), which are easily recognizable on H&E due to their peculiar morphological features. TAPC were measured according to a previously published method [19]. The patterns of distribution of TAPC at the invasive front were recorded by one pathologist (G.F.) and subdivided into four categories on an increasing scale from 0 to 3 as follows: score 0 = no TAPC; score 1 = only scattered TAPC in absence of micro-cluster of 5 TAPC; score 2 = presence of at least one isolated micro-cluster of TAPC with 5 TAPC; score 3 = presence of confluent micro-clusters of TAPC.

Finally, all biopsies of the resection specimen were reviewed by one pathologist (F.D.) for the presence or absence of tertiary lymphoid structures (TLS), considered as a categorical variable in the analysis. A TLS was defined as any lymphoid aggregate with or without germinal center in the peritumoral area, following the indications of the International Immuno-Oncology Working Group [12].

Tissue microarray (TMA) and immunohistochemistry (IHC)

First, all cases (n = 111) were reviewed by a single pathologist (F.D.) to confirm tissue availability and to mark the areas to be included in the tissue microarrays (TMA). Only the zones with a clear micropapillary pattern were selected for staining by immunohistochemistry. After review, 6 cases were excluded due to insufficient residual material in the paraffin block (Fig. 1). For each of the remaining 105 patients, 3 cores with a diameter of 2 mm were placed into a recipient paraffin block, using the TMA Grand Master (3DHistech Ltd., Sysmex Belgium NV). We then assessed by IHC the protein expression levels of six biomarkers in a total of eight TMAs: Wilms’ tumor protein 1 (WT1), paired box gene 8 (PAX8), forkhead box P3 (FOXP3), B-cell lymphoma 2 (BCL2), tumor protein 53 (P53) and cluster of differentiation 8 (CD8), the latter to detect cytotoxic T-lymphocytes. All stainings were scored by a single pathologist (F.D.). Details of the antibody clones, detection method and scoring system are provided in Supplementary Table 1.

To analyze the amount of CD8-positive cytotoxic T-cells within the tumoral area, a CD8 staining was performed on whole tissue sections of all cases with high or intermediate sTILs (n = 24) and on the same number of randomly selected cases with low sTILs. In 23 of the 24 cases with high or intermediate sTILs and in 23 of the 24 randomly selected cases with low sTILs, there was enough tumor material for additional CD8 staining. The number of positive cells per area (mm²) and the percentage of CD8-positive cells in the total amount of sTILs were obtained using QuPath open source software [20]. One pathologist (F.D.) manually selected the invasive tumor front on digital slides. Lymphocytes, tumor cells and the subset of CD8-positive lymphocytes were detected using the QuPath semi-automated algorithms as described by Berben et al. [21]. The density of intraepithelial CD8-positive cells was also measured and annotated as the number of positive intraepithelial cells per area (mm²), with or without correction for the proportion of tumor cell nests (Supplementary Table 2).

Statistical analyses

The Pearson correlation coefficient (Pearson's r) was calculated to assess the interobserver agreement and the correlation of sTILs scores between CNB and resection specimens. The average of the two pathologists' sTILs values on resection specimens was used for further analysis, and a Bland–Altman plot was used to exclude systematic bias between the two pathologists [22].
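For readers less familiar with these two agreement measures, the sketch below computes Pearson's r and the Bland–Altman bias with 95% limits of agreement (bias ± 1.96 standard deviations of the paired differences) in plain Python. It is not the code used in the study, which relied on Excel and SAS; the function names and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman_limits(x, y):
    """Return (bias, lower limit, upper limit) of the paired differences."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical sTILs scores (%) from two observers on the same five cases.
observer_1 = [5, 10, 20, 40, 60]
observer_2 = [10, 10, 25, 35, 55]

print(pearson_r(observer_1, observer_2))
print(bland_altman_limits(observer_1, observer_2))

# Per-case average of the two observers, as used for the downstream analyses.
consensus = [(a + b) / 2 for a, b in zip(observer_1, observer_2)]
```

A bias close to zero with limits of agreement that are acceptable for the intended use suggests that neither observer scores systematically higher than the other.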

Groups were compared on categorical variables using the Fisher exact test. Continuous variables were compared using the Mann–Whitney U test for two groups or the Kruskal–Wallis test for more than two groups. We used Spearman’s rank correlation coefficient to calculate the association between continuous variables and patient characteristics. The presence of CD8-positive T-cells within the tumor area was compared between the two selected groups using an unpaired t-test.

The cumulative incidence function (CIF) was used for estimating distant recurrence free interval (DRFI) and breast cancer-specific survival (BCSS) rates, accounting for death from other causes as a competing event [23]. The association between selected variables and outcome was analyzed using the Fine and Gray model, and results are reported as hazard ratios (HR) with 95% confidence intervals [24]. When no HR could be estimated from the Fine and Gray model because of the absence of events in one of the groups, the Pepe and Mori test was used for comparing outcome between groups.
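To make the role of the competing event concrete, the sketch below implements the standard nonparametric cumulative incidence estimator in plain Python: at each event time, the probability of still being free of any event is multiplied by the fraction of patients at risk who experience the event of interest. This is an illustrative sketch, not the SAS code used in the study, and it does not implement the Fine and Gray regression model. The status coding (0 = censored, 1 = event of interest, 2 = competing event) and the example data are assumptions.

```python
def cumulative_incidence(times, statuses, cause=1):
    """Nonparametric cumulative incidence for one cause with competing risks.

    statuses: 0 = censored, any other integer = an event type (assumed coding).
    Returns a list of (time, cumulative incidence) pairs at each event time.
    """
    records = sorted(zip(times, statuses))
    at_risk = len(records)
    event_free = 1.0      # probability of no event of any type before t
    incidence = 0.0
    curve = []
    index = 0
    while index < len(records):
        t = records[index][0]
        # All records sharing this time (contiguous because of the sort).
        tied = [s for time, s in records[index:] if time == t]
        n_cause = sum(1 for s in tied if s == cause)
        n_any = sum(1 for s in tied if s != 0)
        if n_any:
            incidence += event_free * n_cause / at_risk
            event_free *= 1 - n_any / at_risk
            curve.append((t, incidence))
        at_risk -= len(tied)
        index += len(tied)
    return curve


# Hypothetical follow-up (months): 1 = distant relapse, 2 = death from other cause.
months = [12, 30, 45, 60, 100]
status = [1, 0, 2, 1, 0]
for t, value in cumulative_incidence(months, status):
    print(t, round(value, 3))
```

Treating competing deaths as censored observations in a Kaplan–Meier analysis would instead overestimate the probability of the event of interest, which is why the cumulative incidence estimator is preferred in this setting.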

Analyses were performed using Excel (2016) and SAS software (version 9.4 of the SAS System for Windows).

All tests were two-sided and the significance level was set at p ≤ 0.05. Given the exploratory nature of the study, no correction for multiplicity was performed.

Results

Clinicopathological features, sTILs, TAPC, TLS and biomarker expression by IHC in pure versus non-pure IMPCs

All patients were female, with a median age at diagnosis of 61.5 years (range 33–88 years). The standard clinicopathological parameters, surrogate subtypes and results of immunohistochemical stainings, sTILs, TAPC and TLS for the whole cohort and for the pure and non-pure IMPC subgroups separately are provided in Table 1. Approximately 59% (n = 65) of cases showed pure IMPC histology. Half of the cases were poorly differentiated (n = 56), about 10% of the cases showed tumors larger than 5 cm (n = 11) and almost two thirds of the patients presented with at least one positive lymph node (n = 61). LVI was observed in 49 cases. Over 80% of cases were classified either as luminal A-like (n = 51) or as luminal B-like (n = 41); HER2 amplification was recorded in 15% of the cases (n = 17), while only two cases were TNBC. We recorded aberrant P53 nuclear expression in only 10 cases; no cytoplasmic or null pattern was observed. The presence of a ductal carcinoma in situ (DCIS) component and FOXP3 expression as a categorical variable were significantly associated with non-pure IMPCs; no other differences were observed when we compared pure versus non-pure IMPCs (Table 1).

Table 1 Clinicopathological features, surrogate subtypes and results of immunohistochemical stainings

We measured the average amount of peritumoral stromal infiltration by mononuclear inflammatory cells. On the core needle biopsies (CNB), we found only a moderate correlation between the two pathologists (r = 0.62; p < 0.001), while on the resection specimens the sTILs scores showed a strong correlation (r = 0.74; p < 0.001). The correlation between sTILs scores on CNB and resection specimen was moderate for each pathologist (r = 0.58; r = 0.64; p < 0.001, data not shown). The Bland–Altman plot showed only a marginal and statistically non-significant bias between the two observers (Supplementary Fig. 1). Overall, the mean sTILs level was 20%. Both pure and non-pure IMPC showed a higher proportion of sTILs at the invasive front. Remarkably, the zones with micropapillary differentiation showed scarce intervening stroma, often with only rare non-inflammatory stromal cells. The distribution of the hot spot region at the invasive front and in the inner region of the tumors showed similar results. There were no significant differences between pure and non-pure IMPC in sTILs levels (both as continuous and categorical variable), patterns of TAPC or presence of TLS (Table 1).

Association of sTILs, TAPC, TLS and biomarker expression by IHC with surrogate subtypes

The association of sTILs, TAPC, TLS and biomarker expression by IHC with surrogate subtypes is shown in Table 2. We observed a significant association between the surrogate molecular subtype and aberrant P53 and BCL2 expression. For sTILs (both as continuous and categorical variable) and FOXP3, a trend was observed. We additionally explored the association of sTILs, pattern of TAPC, TLS and biomarker expression by IHC in pairwise comparisons of surrogate molecular subtypes (Supplementary Table 3). We observed that BCL2 was strongly related to all luminal subtypes as compared to the non-luminal ones, while P53 was especially related to the ER-negative/HER2+ IMPCs as compared to luminal surrogate subtypes, including luminal B-like HER2+. The pairwise comparison showed significantly higher sTILs (as a categorical variable) in HER2-positive IMPCs as compared to the luminal A-like subtype. The distribution patterns of TAPC within the tumoral inflammatory infiltrate were not associated with surrogate molecular subtypes. Presence of TLS was related to HER2+ IMPC as compared to luminal A- or luminal B-like IMPC.

Table 2 Association between sTILs, TAPC, TLS and biomarker expression by IHC and surrogate subtypes

Distant recurrence free interval and breast cancer-specific survival in pure and non-pure IMPCs and relation with sTILs, TAPC, TLS and biomarker expression by IHC

The associations between sTILs (global average, hot spot analysis and their spatial distributions), TAPC, TLS, selected immunohistochemical markers and DRFI or BCSS using the Fine and Gray univariate analysis model are shown in Tables 3 and 4. After a median follow-up of 100 months, 7 distant relapses (6%) and 3 breast cancer-related deaths (3%) were observed. There was a significant difference in DRFI with distant relapse events observed only in patients with non-pure IMPCs (n = 7; p = 0.005). The hazard ratio could not be estimated due to the absence of events in pure IMPCs. Regarding BCSS, we observed a trend toward worse BCSS in non-pure IMPCs compared to pure IMPCs (p = 0.070). Kaplan–Meier curves for DRFI and BCSS in pure and non-pure IMPCs are shown in Fig. 2. Negative WT1 expression and positive FOXP3 expression were also significantly associated with worse DRFI, but not with BCSS (Fig. 2). Additionally, we observed that sTILs as a continuous variable was associated both with worse DRFI and worse BCSS (respectively, HR 1.55 [1.08;2.21] and HR 2.1 [1.4;3.14], both p < 0.02). As a categorical variable, high versus low sTILs was prognostic for worse DRFI (HR 7.03 [1.45;34.13], p = 0.02) with a trend toward worse BCSS. For the categorical global test, a trend toward worse DRFI was observed.

Table 3 Association between DRFI and sTILs, TAPC, TLS and biomarker expression by IHC
Table 4 Association between BCSS and sTILs, TAPC, TLS and biomarker expression by IHC
Fig. 2

Kaplan–Meier curves of DRFI and BCSS for pure and non-pure IMPC, Kaplan–Meier curves of DRFI according to FOXP3 and WT1 status, and proportion of patients without DRFI and BCSS event at 5 years according to mean % of sTILs as a continuous variable. a DRFI in non-pure and pure IMPC, p < 0.01. b BCSS in pure and non-pure IMPC, p = 0.07. c DRFI in FOXP3-positive and FOXP3-negative IMPC, p = 0.01. d DRFI in WT1-positive and WT1-negative IMPC, p = 0.02. e Estimated proportion of patients without distant relapse at 5 years by mean % of sTILs, p = 0.02. f Estimated proportion of patients without breast cancer-related death by mean % of sTILs, p < 0.01. BC breast cancer, DRFI distant recurrence free interval, BCSS breast cancer-specific survival

Subsequently we measured by immunohistochemistry the proportion and the density of CD8-positive lymphocytes, to investigate whether IMPCs with high sTILs could have had a lower amount of CD8-positive cytotoxic lymphocytes. However, we did not observe any reduction in the proportion and density of CD8-positive lymphocytes in the IMPCs with high or intermediate sTILs as compared to IMPCs with low sTILs (Supplementary Fig. 2).

Additionally, we evaluated germinal centers in lymph nodes with and without IMPC metastasis, comparing IMPC with high or intermediate sTILs to IMPC with low sTILs. We found no significant difference in the number of germinal centers between lymph nodes with metastasis and lymph nodes without metastasis. We also observed no differences in the number of germinal centers between pure and non-pure IMPC, nor between IMPC with high/intermediate and low sTILs, when analyses were performed separately for lymph nodes with and without tumor metastasis (Supplementary Tables 4, 5).

Prognostic correlations with spatial distribution of hot spot analysis, TLS and TAPC are shown in Table 3. Regarding sTILs hot spot analysis, higher stromal TILs in the hot spot region in the inner part of the tumor significantly correlated with worse DRFI (HR 1.05 [1.02;1.08] per 1% increment, p = 0.0004) and worse BCSS (HR 1.04 [1.01;1.08] per 1% increment, p = 0.0047). Higher stromal TILs in the hot spot located at the invasive front was significantly associated with worse BCSS (HR 1.05 [1.01;1.09] per 1% increment, p = 0.0053), while a trend toward worse DRFI was observed (HR 1.03 [1.00;1.06] per 1% increment, p = 0.0541). There were no significant correlations between the delta hot spot values and DRFI or BCSS, but a trend toward better DRFI in tumors with delta > 0 (more sTILs in hot spots at invasive front) as compared to IMPC with delta 0 (no differences in hot spot sTILs between invasive front and central region) was observed (HR 4.37 [0.94;20.32], p = 0.06) (Table 3).
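Because the hazard ratios above are expressed per 1% increment, their size over a larger step is easy to underestimate. Under the log-linear covariate effect assumed in a proportional (subdistribution) hazards model, the hazard ratio for an increment of k percentage points is the per-1% hazard ratio raised to the power k. The sketch below applies this to the rounded values reported above; the function name is hypothetical, and the rescaled numbers inherit the rounding of the published estimates.

```python
def rescale_hazard_ratio(hr_per_unit, ci_low, ci_high, units):
    """Rescale a per-unit hazard ratio and its confidence limits.

    Assumes a log-linear covariate effect, so the HR multiplies per unit.
    """
    return tuple(value ** units for value in (hr_per_unit, ci_low, ci_high))


# DRFI, inner hot spot: HR 1.05 [1.02;1.08] per 1% increment (from the text).
hr10, low10, high10 = rescale_hazard_ratio(1.05, 1.02, 1.08, units=10)
print(f"per 10% increment: HR {hr10:.2f} [{low10:.2f};{high10:.2f}]")
```

With these inputs, a 10-percentage-point higher hot spot score corresponds to a hazard ratio of roughly 1.63, which conveys the effect size more directly than the per-1% figure.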

The absence of TLS showed a trend toward better DRFI (HR 0.26 [0.06;1.13], p = 0.072), while there was no correlation nor trend observed with BCSS (HR 0.18 [0.02;2.00], p = 0.16).

Regarding the prognostic effect of TAPC, we observed no differences in DRFI and BCSS between IMPC with TAPC score 0 (no TAPC) or 1 (scattered TAPC in absence of micro-clusters) as compared to IMPC with TAPC score 2 (at least one isolated micro-cluster of 5 TAPC) or 3 (confluent micro-clusters of TAPC).

Given the very low numbers of events, we were not able to perform multivariate time-to-event outcome analysis. Caution in interpretation of the observed prognostic correlations is required.

Evaluation of sTILs in distant metastases and comparison with primary tumor

In 3 out of 8 patients presenting with distant metastasis, FFPE tissue was available for histopathological review. One patient presented with liver and bone marrow metastasis after 80 months of follow-up, one patient with liver metastasis after 37 months and one patient with skin and pleural metastasis after 27 months of follow-up. Stromal TILs in these metastatic specimens in comparison with the primary tumor of these patients were 0% and 7% vs 11% in the primary tumor; 10% vs 55% in the primary tumor; and 2% and 40% vs 40% in the primary tumor, respectively.

Discussion

Pure IMPCs are a special type of breast cancer, accounting for 0.9–2% of breast cancer cases [1]. Despite the high rate of lymph node metastases, the prognosis of IMPCs is debated in the literature [4, 6,7,8,9, 25, 26]. Previous IMPC-specific studies described outcomes based mainly on overall survival in very heterogeneous patient populations, mostly using a case–control strategy. Large-scale studies are relatively scarce in the literature, with only eight studies reporting on IMPC series of more than 100 patients with a median follow-up of 39 to 72 months, of which only one reported on death related to breast cancer [6,7,8,9,10, 25,26,27,28,29,30]. Available information about long-term breast cancer-specific outcomes in IMPC is therefore very scarce.

In this study we report on a well characterized cohort of patients with pure and non-pure IMPCs who underwent primary surgery for stage I-III breast cancer, investigating DRFI and BCSS in relation to sTILs scores and expression of specific prognostic biomarkers.

The median follow-up in our cohort was 100.6 months, which is considerably longer than the follow-up reported in previous studies (see above). We observed only seven cases of distant relapse and three cases of breast cancer-specific death, resulting in favorable outcomes despite the high proportion of poorly differentiated tumors and the high rate of lymph node involvement in our study population. This observation is in line with the recent hypothesis that the prognosis of IMPCs is comparable to or even better than that observed in invasive breast carcinoma of no special type [7,8,9, 29]. A possible explanation for the favorable outcome observed in this cohort is the high rate of ER-positive tumors in our study population (about 94%, of which 50% luminal A-like), a finding also described by other authors [7].

An important limitation of our study is the very low number of events despite long-term follow-up, which precluded multivariable analysis. As such, the univariate outcome analyses evaluating prognostic variables should be interpreted with caution. Despite the absence of significant differences in traditional prognostic factors or in administered treatments between pure and non-pure IMPC, all distant recurrences and breast cancer-specific deaths in this study population occurred in non-pure IMPCs. These findings might suggest intrinsic differences at the genomic level between pure and non-pure IMPCs. In our analysis, the global average and the hot spot analysis of sTILs and their spatial distribution, TLS and the patterns of distribution of TAPC were similar between pure and non-pure IMPC. However, a recent study based on array comparative genomic hybridization technology revealed no significant differences in the structural genomic profile and immunophenotype of mixed cases as compared to those of pure IMPCs [31, 32]. The implementation of more detailed molecular analysis at single cell level, together with the integration of spatial information, may be useful to better understand biologic differences between pure and non-pure IMPCs [33].

In this study we observed that higher sTILs (as a continuous variable) were associated with a higher risk of distant recurrence and breast cancer-related death in IMPC. Moreover, we found that each 1% increment in sTILs in the hot spot region in the inner part of the tumor was significantly associated with worse DRFI and BCSS. Higher sTILs in the hot spot located at the periphery showed a statistically significant association only with worse BCSS. Interestingly, recent literature suggests that the spatial heterogeneity of the immune infiltrates may have a strong prognostic value in both ER− and ER+ breast cancers [34, 35]. The identification of hot spot regions with the aid of computer-assisted image analysis on H&E has revealed that the location of the hot spot region is by itself predictive of a better or a worse prognosis in ER− and ER+ tumors, respectively. We consider our hot spot analysis a surrogate of the computer-guided image analysis described in the above referenced papers, warranting further validation in other cohorts of patients.

These observations should be interpreted carefully given the low number of events, but are in line with results previously reported by Guo et al. and Heindl et al. [13, 35]. High sTILs in luminal HER2-negative breast cancer are regarded as an adverse prognostic factor for survival, in contrast to HER2-positive breast cancer and TNBC [36]. To better understand our findings, we hypothesized that IMPCs with high sTILs could have had a lower amount of CD8+ cytotoxic lymphocytes. To explore this, we stained high and intermediate sTILs cases with CD8 immunohistochemistry and compared them to an equal number of cases with low sTILs. Both the proportion and the density of CD8+ lymphocytes differed significantly between the group with high or intermediate sTILs and the group with low sTILs (Supplementary Fig. 2). This result most likely reflects the scores obtained on H&E rather than an underrepresentation of CD8+ cytotoxic cells in the IMPCs with high sTILs. Another possibility is that some IMPCs could have acquired mechanisms to evade immune surveillance. Previous research on IMPC with prominent lymphocytic infiltration showed that IMPCs have a relatively lower number of effectively active tumor-killing cytotoxic T-lymphocytes as compared to breast carcinomas with medullary features [37]. Functional studies using techniques like multiplex immunohistochemistry are expected to provide more insight into the co-expression of activation markers in T-lymphocytes present in the tumor micro-environment of IMPCs.

BCL2, P53 and FOXP3 expression has been variably linked to breast cancer behavior and subtype. Expression of BCL2 in breast carcinomas is generally associated with the expression of the estrogen receptor [38,39,40]. BCL2 has been reported to be an independent prognostic factor for breast cancer with a reduced likelihood of adverse outcome in case of BCL2 expression [41,42,43]. Evaluation of aberrant P53 staining has been recently linked to a worse overall survival in breast carcinoma [44]. P53 aberrant pattern by immunohistochemistry is predictive for pathogenic TP53 mutations, which are more frequently found in luminal B-like and basal-like tumors as compared to other breast cancer subtypes [45]. In our cohort, the correlations between staining patterns and outcome are difficult to interpret given the low number of events. However, the FOXP3 expression in breast cancer cells significantly correlated with a worse DRFI, which is consistent with results from a previous study [46].

Few IMPCs in our study population showed nuclear expression of WT1. Previous studies also reported sporadic expression of WT1 in rare cases of IMPC [47, 48]. Because of the potential differential diagnosis with a metastasis from a tubo-ovarian serous carcinoma, we performed PAX8 immunostaining to further characterize our cohort. As expected, none of the tumors showed PAX8 expression, despite the sporadic expression of WT1. However, depending on the choice of PAX8 antibody, up to 3.5% of breast carcinomas can show weak PAX8 nuclear expression [49]. Interestingly, we found that lack of WT1 expression was related to worse DRFI in our study. These findings are inconsistent with other studies that investigated protein expression and mRNA presence and found a worse prognosis in WT1-positive tumors in comparison with WT1-negative tumors [50, 51].

Besides the limitations of sample size and low number of events, the evaluation of TLS in this study was performed on H&E slides without additional immunohistochemistry. Given the known inconsistencies in scoring TLS on H&E [52], the analyses involving TLS should be interpreted with caution. In our opinion, a characterization of TLS based on the combination of CD21, BCL6, BCL2 and Ki67, with or without CD3 and CD20, would be more desirable to achieve a better understanding of the TLS dynamics in IMPC or other histological breast cancer subtypes.

In conclusion, we described standard clinicopathological features and performed a quantitative and spatial analysis of sTILs and TAPC in a cohort of stage I–III IMPCs treated with primary surgery. The mean sTILs level was 20%, with a higher proportion of sTILs at the invasive front. Clinicopathological characteristics, quantitative and spatial analysis of sTILs and distribution patterns of TAPC were similar between pure and non-pure IMPC. Despite a high proportion of grade 3 tumors and lymph node involvement, we observed a low rate of distant recurrences and breast cancer-related death after a median follow-up of 100 months. All distant recurrences occurred in the non-pure IMPC subgroup. Higher sTILs correlated with worse DRFI and BCSS in this IMPC cohort. Caution in interpretation of the observed prognostic correlations is required given the very low number of events, warranting validation in other cohorts.