Introduction

Gastric cancer is the fifth most common cancer and the third leading cause of cancer-related death worldwide [1]. While very early-stage disease can potentially be cured with endoscopic resection or surgery alone, combined modality therapies are needed to reduce the risk of recurrence for resectable disease. For patients with locally advanced gastric cancer, adjuvant treatment can prolong survival, but standard adjuvant treatments differ by region according to the pivotal trials conducted there. In Eastern Asian countries, adjuvant chemotherapy using S-1 or capecitabine plus oxaliplatin is the current standard based on the ACTS-GC [2] or CLASSIC [3] trials, respectively.

The survival outcomes of patients receiving gastrectomy and adjuvant chemotherapy in Eastern Asia are reportedly better than those of patients in Western countries, which is attributable to a nationwide screening system leading to an earlier diagnosis [4] and the widespread application of D2 dissection [5]. Nevertheless, patients with an advanced stage at diagnosis still exhibit poor clinical outcomes, with five-year overall survival rates of 67–70 and 50–66% for pathologic stage IIIA and IIIB, respectively [6, 7]. Furthermore, adjuvant chemotherapy is often delayed following surgical resection due to surgical morbidities, which could negatively impact clinical outcomes [8], and chemotherapy after gastrectomy is associated with frequent adverse events [9].

Neoadjuvant chemotherapy can be considered to intensify chemotherapy and allow for the initiation of chemotherapy at an earlier time when patients are more medically fit. In the recent phase 3 PRODIGY study, we demonstrated that patients who received additional neoadjuvant chemotherapy followed by surgical resection and adjuvant S-1 (CSC group) exhibited significantly improved progression-free survival (PFS) compared to those who received standard therapy with surgical resection and adjuvant S-1 only (SC group) [10].

While the PRODIGY study met its primary endpoint, the optimal candidates for neoadjuvant treatment remain to be defined. Although the PRODIGY study included patients with locally advanced disease by clinical assessment (i.e., cT2/3N+ or cT4Nany), these radiological criteria were associated with a non-negligible proportion of pathological stage (pStage) I disease (11%) in the SC group [10]. Considering that patients with pStage I disease could be sufficiently treated without (neo)adjuvant chemotherapy, baseline radiological criteria to minimize the inclusion of early-stage disease are warranted. This concept is further supported by the subgroup analyses of the PRODIGY study showing that the PFS and OS benefit of neoadjuvant chemotherapy was more prominent in advanced clinical stages [10]. In the JCOG1302A study, the cT3/T4N+ stage was proposed as a potential criterion for neoadjuvant treatment because it was associated with a lower proportion of pathological stage I disease (6.5%) among patients who received surgery up-front [11]. However, no study has evaluated radiological criteria with regard to the clinical outcomes of patients treated with neoadjuvant chemotherapy. Furthermore, the difficulties of accurate clinical T staging [11, 12] and inconsistent radiological evaluations of lymph nodes (LNs) among the studies [11, 13, 14, 15, 16, 17, 18, 19] may make it challenging to define a patient subgroup that will benefit the most from neoadjuvant chemotherapy.

In this exploratory analysis from the PRODIGY study, we aimed to define patient subgroups who might optimally derive clinical benefit from neoadjuvant chemotherapy. The correlation between radiological and pathological assessments was evaluated based on clinical T (cT) stage, determined by the invasion depth, and on the radiological methods for LN evaluation adopted by the PRODIGY and JCOG1302A studies. We aimed to delineate radiological staging criteria that exclude pathologically early-stage disease while including an acceptable proportion of pathologically advanced disease. We also examined whether applying these criteria could translate into better clinical outcomes by administering neoadjuvant chemotherapy to the appropriate patient subgroup.

Patients and methods

Study patients and treatments

A total of 484 patients were derived from the full analysis set (FAS) of the PRODIGY study [10]. Key eligibility criteria were an age of 20–76 years, an Eastern Cooperative Oncology Group performance status of 0–1, histological confirmation of primary gastric or gastroesophageal junction adenocarcinoma, and resectable and locally advanced disease as defined by cT2/3N+ or cT4Nany stage by the American Joint Committee on Cancer, 7th Edition. The CSC group (n = 238) was allocated to receive three cycles of neoadjuvant DOS (docetaxel 50 mg/m² and oxaliplatin 100 mg/m² iv day 1, S-1 40 mg/m² po bid days 1–14 q3w), D2 surgery, and eight cycles of adjuvant S-1 (40–60 mg po bid days 1–28 q6w). The SC group (n = 246) was allocated to receive D2 surgery followed by eight cycles of adjuvant S-1. This study was approved by the institutional review boards at all participating institutions, and written informed consent was provided by all study subjects.

Radiological methods to assess invasion depth and lymph node positivity

To assess study eligibility and patient stratification in the original PRODIGY study, baseline CT scan images uploaded to the study website were reviewed by a central reviewer, JSL, a board-certified abdominal radiologist with more than 10 years of experience in abdominal imaging.

cT stage was determined according to the depth of invasion: cT2 with muscularis propria invasion; cT3 with subserosal connective tissue invasion; cT4a with serosal invasion; and cT4b with invasion to adjacent structures [20]. In the original PRODIGY study, clinical lymph node positivity was assessed based on the ‘size and morphology method’, which classified lymph nodes as positive when the short axis was ≥ 8 mm (irrespective of the lymph node shape) or the shortest diameter was ≥ 5 mm with central necrosis, a round shape, perinodal infiltration, and/or prominent enhancement [15, 21,22,23]. For this post hoc exploratory study, lymph node status was additionally reviewed by the central reviewer based on the ‘size only method’ by which patients were considered lymph node-positive when they had lymph nodes with a short axis of ≥ 8 mm regardless of their morphologic features [11, 18, 19].
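The two lymph node rules described above can be written as simple predicates. The following is a minimal Python sketch under stated assumptions: the `LymphNode` fields, the function names, and the patient-level rule (a patient is positive if any node is positive) are illustrative choices of ours, not part of the PRODIGY protocol or its software.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class LymphNode:
    """Hypothetical record of one node as measured on baseline CT."""
    short_axis_mm: float
    central_necrosis: bool = False
    round_shape: bool = False
    perinodal_infiltration: bool = False
    prominent_enhancement: bool = False


def positive_size_only(node: LymphNode) -> bool:
    # Size-only method: short axis >= 8 mm, morphology ignored.
    return node.short_axis_mm >= 8.0


def positive_size_and_morphology(node: LymphNode) -> bool:
    # Size and morphology method: >= 8 mm regardless of shape, or
    # >= 5 mm with at least one suspicious morphologic feature.
    if node.short_axis_mm >= 8.0:
        return True
    suspicious = (
        node.central_necrosis
        or node.round_shape
        or node.perinodal_infiltration
        or node.prominent_enhancement
    )
    return node.short_axis_mm >= 5.0 and suspicious


def patient_positive(nodes: Iterable[LymphNode],
                     rule: Callable[[LymphNode], bool]) -> bool:
    # Assumption: one positive node makes the patient clinically LN-positive.
    return any(rule(n) for n in nodes)


small_round_node = LymphNode(short_axis_mm=6.0, round_shape=True)
print(positive_size_and_morphology(small_round_node))  # True
print(positive_size_only(small_round_node))            # False
```

Because every node that is positive by the size-only rule is also positive by the size and morphology rule, the second method can only reclassify patients from positive to negative, never the reverse.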

Correlation between radiological staging and pathological staging

The correlation between radiological and pathological assessments was analyzed for patients in the SC group (n = 246) (Figure S1A). We examined the proportion of patients with different pathological stages according to clinical T stage and T/N stage by radiological methods. The proportion of pStage I disease and sensitivity for pStage III disease were evaluated according to several criteria, including different TN stages assessed by each radiological method.

In the SC group, after excluding 19 patients who underwent open and closure (n = 16) and palliative surgery (n = 3) due to metastatic disease at the time of surgery, we analyzed 227 patients who had surgical specimens available for pathological analysis. Among these patients, we examined the concordance between radiological and pathological assessment for T staging and LN positivity. We also examined the sensitivity, positive predictive value, specificity, and negative predictive value of forecasting each pathological T stage and LN positivity by radiological assessments.

Efficacy of neoadjuvant chemotherapy according to each clinical stage

For the FAS population involving both CSC and SC groups, we assessed the hazard ratio (HR) for PFS of the CSC arm in subgroups defined by different radiological criteria by both radiological methods of LN evaluation (Figure S1B).

Statistical analysis

PFS was defined as the interval from randomization to the date of disease progression (PD) or death. PD was determined using RECIST v1.1 during neoadjuvant chemotherapy for patients in the CSC group. Distant metastases identified during surgery or R2/R1 resection not resolved by subsequent surgery were considered PD. Recurrence/distant metastasis during follow-up after R0 resection was deemed PD. OS was defined as the interval from randomization to the date of death from any cause. The comparison of sensitivity and specificity was based on the McNemar and exact binomial tests. The comparison of positive and negative predictive values was based on a comparison of relative predictive values. Cox proportional hazard modeling was used to assess the relative risk reduction of the CSC group (vs. the SC group). A p value of < 0.05 was considered statistically significant. Statistical analyses were performed using R software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria).
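For readers unfamiliar with the paired comparison of sensitivities, the sketch below shows one common form of the exact (binomial) McNemar test in Python. It is an illustration only, not the R code used in this study; the function name `mcnemar_exact` is hypothetical. The example counts are inferred from the lymph node results of this study: among pathologically LN-positive patients, 59 were clinically positive only by the size and morphology method, and by definition none can be positive only by the size-only method.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p value from the two discordant counts.

    b: pairs positive by method 1 only; c: pairs positive by method 2 only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as extreme as k in one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Discordant counts inferred from the text (see lead-in), not from Table S3.
p_sensitivity = mcnemar_exact(59, 0)
print(p_sensitivity < 0.001)  # True
```

Only the discordant pairs enter the test; patients classified identically by both methods carry no information about which method is more sensitive.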

Results

Study patients

Baseline patient characteristics are described in Table 1. The median age was 58 years for both groups. The proportion of patients with each cT stage was comparable between the two groups (p = 0.872). The proportion of patients with clinically positive LNs by the size and morphology method was 96.7 and 98.3% for the SC and CSC groups, respectively, while with the size-only method, it was 58.2 and 57.1% for the SC and CSC groups, respectively.

Table 1 Clinical characteristics of the study patients

Correlation of T stage between radiological and pathological assessments

The distribution of pathological T stages according to cT stage was examined in 227 patients who could be pathologically evaluated (Fig. 1A). With an overall concordance rate of 48.2%, concordance rates for cT2, cT3, cT4 were 30.7, 40.7, and 51.8%, respectively. Among the different clinical T stages, cT4 disease showed the highest sensitivity (85.6%) and positive predictive value (51.9%) for accurately predicting the pathologic T stage (Table S1).

Fig. 1

Concordance between clinical and pathological T stages. The proportion of each pathological T stage (A) and overall pathological stage (B)

For patients in the SC group (n = 246), the distribution of overall pStages was examined for each cT stage. For cT2 and cT3, the proportion of pStage I was 54.5% and 23.2%, respectively, whereas cT4 disease had a pStage I proportion of 4.5% (Fig. 1B). In patients with cT4 disease, the proportion of pStage III and pStage IV was 63.8 and 17.9%, respectively.

Lymph node positivity by two different radiological assessment methods

Among pathologically evaluated patients (n = 227), 220 (96.9%) and 130 patients (57.3%) had clinically positive LNs assessed by the size and morphology and size only methods, respectively. The concordance for LN positivity by each method is presented in Table S2. Among the 90 patients who had clinically positive LNs by the size and morphology method but clinically negative LNs by the size-only method, 59 (65.6%) were found to have malignant LNs by pathological assessment.

The sensitivity for detecting pathologically malignant LNs was significantly higher by the size and morphology method (97.1%) than by the size-only method (63.2%) (p < 0.001) (Table S3). The positive predictive value was higher with the size-only method (84.6%) compared to the size and morphology method (76.8%) (p = 0.002) (Table 2). The size and morphology method exhibited lower specificity (3.8 vs. 62.3%, p < 0.001) than the size only method, while the negative predictive value was comparable between the two methods (28.6 vs. 34.0% for the size and morphology and size only methods, respectively, p = 0.763) (Table S3).
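The four measures reported above follow from a 2 × 2 table of clinical versus pathological LN status. The Python sketch below recomputes them as a consistency check. The cell counts are not taken from Table S3: they were back-calculated by us from the reported totals (227 evaluated patients, of whom 220 and 130 were clinically positive by each method) and the reported percentages, and should be read as an assumption.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from 2 x 2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated counts (assumption, see lead-in); each row sums to 227.
size_and_morphology = diagnostic_metrics(tp=169, fp=51, fn=5, tn=2)
size_only = diagnostic_metrics(tp=110, fp=20, fn=64, tn=33)

for name, metrics in [("size and morphology", size_and_morphology),
                      ("size only", size_only)]:
    rounded = {k: round(100 * v, 1) for k, v in metrics.items()}
    print(name, rounded)
```

With these counts the recomputed percentages agree with the reported values for both methods, and the difference in false negatives (64 versus 5) equals the 59 pathologically positive patients who were clinically positive only by the size and morphology method.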

Table 2 Proportion of pathological stage I disease and sensitivity for pathological stage III disease by criteria based on each radiological method of evaluating lymph nodes

Pathological stage according to different TN stages by the radiological methods for assessing lymph node positivity

The percentages of pStage I disease were 46.2 and 23.2% for cT2N+ and cT3N+, respectively, for patients in the SC group (n = 246) according to the size and morphology method adopted by the original PRODIGY study (Fig. 2A). The T/N stages based on the size and morphology method involving cT4 disease had ≥ 62.7% of pStage III disease (Fig. 2A).

Fig. 2

Distribution of pathological stages according to clinical T and N stages. The proportion of each pathological stage was represented based on the size and morphology method (A) and size-only method (B)

Because the size-only method did not classify LNs with a short diameter of ≥ 5 mm and characteristic morphologic features as malignant, 4 and 27 patients were reclassified as having cT2N0 and cT3N0 disease, respectively, by the size-only method. While the percentage of pStage I disease was at least 22.2% for cT2N0, cT3N0, cT2N+, and cT3N+ diseases, it was 6.7% or lower for cT4aN0, cT4aN+, cT4bN0, and cT4bN+ diseases (Fig. 2B).

By either method of LN evaluation, cT3N+, cT4aN0, cT4aN+, and cT4bN+ had cases with M1 disease. Among these TN stages, cT3N+ had the lowest percentage of M1 disease (3.5 and 6.9% by the size and morphology and size-only methods, respectively), whereas the others (i.e., cT4aN0, cT4aN+, and cT4bN+) had around 10% with M1 disease according to both methods (Fig. 2B).

The proportion of pathological stage I disease and the sensitivity for pathological stage III disease by radiological criteria, including different TN stages

Based on the size and morphology method, the following criteria included patients with different TN stages: Criterion 1 with cT2/3N+ or cT4Nany; Criterion 2 with cT3N+ or cT4Nany; Criterion 3 with cT3/4N+; Criterion 4 with cT4Nany; and Criterion 5 with cT4N+. The percentages of pStage I disease by Criteria 1, 2, and 3 ranged from 9.3 to 11.9%, while the sensitivity for detecting pStage III disease by these criteria was ≥ 92.2%. Criterion 4 had 5.0% pStage I disease and an 80.7% sensitivity for pStage III disease, while Criterion 5 had corresponding values of 4.5 and 75.9% (Table 2).

By the size-only method, which classified patients differently than the size and morphology method, the following criteria were evaluated: Criterion A, with cT2/3Nany, which includes the same population as Criterion 1; Criterion B, with cT2N+ or cT3/4Nany; Criterion C, with cT3/4Nany, which includes the same population as Criterion 2; Criterion D, with cT2/3N+ or cT4Nany; Criterion E, with cT3/4N+ or cT4Nany; Criterion F, with cT3/4N+; Criterion G, with cT4Nany, which matched Criterion 4; and Criterion H, with cT4N+. For Criteria A–F, the percentage of pathological stage I disease was 7.4–11.9%, while the sensitivity for pStage III disease was ≥ 92.4%, except for Criterion F (61.0%). By Criteria G and H, the pStage I percentages were 5.0 and 4.3%, respectively, while the sensitivity for pStage III was 80.7 and 51.1%, respectively (Table 2).
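To make the two summary measures concrete, the Python sketch below evaluates a few of the criteria on a small invented cohort. Everything in it is illustrative: the `Patient` fields, the eight toy patients, and the helper names are hypothetical. It also assumes that the proportion of pStage I disease is computed among patients meeting a criterion, and that sensitivity for pStage III disease is the share of all pStage III patients who meet the criterion.

```python
from typing import Callable, List, NamedTuple, Tuple


class Patient(NamedTuple):
    ct: str            # clinical T stage: "T2", "T3", "T4a" or "T4b"
    cn_positive: bool  # clinical LN status by the chosen radiological method
    p_stage: str       # pathological stage: "I", "II", "III" or "IV"


def is_ct4(p: Patient) -> bool:
    return p.ct.startswith("T4")


# A subset of the criteria, written as membership tests.
CRITERIA = {
    "cT2/3N+ or cT4Nany": lambda p: is_ct4(p) or p.cn_positive,
    "cT3/4N+": lambda p: p.ct != "T2" and p.cn_positive,
    "cT4Nany": is_ct4,
    "cT4N+": lambda p: is_ct4(p) and p.cn_positive,
}


def evaluate(cohort: List[Patient],
             criterion: Callable[[Patient], bool]) -> Tuple[float, float]:
    """Return (proportion of pStage I among selected, sensitivity for pStage III)."""
    selected = [p for p in cohort if criterion(p)]
    stage3 = [p for p in cohort if p.p_stage == "III"]
    if not selected or not stage3:
        raise ValueError("criterion selects no patients or cohort has no pStage III")
    prop_stage1 = sum(p.p_stage == "I" for p in selected) / len(selected)
    sens_stage3 = sum(criterion(p) for p in stage3) / len(stage3)
    return prop_stage1, sens_stage3


# Invented toy cohort, not study data.
toy_cohort = [
    Patient("T2", True, "I"), Patient("T3", True, "I"),
    Patient("T3", True, "III"), Patient("T3", False, "II"),
    Patient("T4a", False, "III"), Patient("T4a", True, "III"),
    Patient("T4a", True, "I"), Patient("T4b", True, "IV"),
]

for name, criterion in CRITERIA.items():
    prop1, sens3 = evaluate(toy_cohort, criterion)
    print(f"{name}: pStage I {prop1:.0%}, sensitivity for pStage III {sens3:.0%}")
```

Narrowing a criterion removes early-stage patients from the selected group but also removes some pStage III patients, which is the trade-off the criteria are compared on.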

Efficacy of neoadjuvant chemotherapy according to the radiological criteria

The HR for PFS with neoadjuvant chemotherapy was then evaluated in the patient subgroups meeting each radiological criterion (Fig. 3). Among the patient subgroups determined based on the size and morphology method, the relative risk reduction for PFS was most prominent in patients meeting Criterion 4 (cT4Nany) [n = 343 (70.9%), HR 0.67, 95% CI 0.48–0.93, p = 0.019] and Criterion 5 (cT4N+) [n = 331 (69.0%), HR 0.68, 95% CI 0.49–0.94, p = 0.032]. As for the patient subgroups defined by the size-only method, those meeting Criterion G (cT4Nany) (n = 343), the same criterion as Criterion 4 of the size and morphology method, showed the lowest HR for PFS (HR 0.67, 95% CI 0.48–0.93, p = 0.019). The HR for the PFS of the CSC group in patients with cT2/3 (n = 141) was 1.42 (95% CI 0.66–3.06, p = 0.370).
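As a reading aid, a hazard ratio and its 95% confidence interval can be converted into an approximate relative risk reduction and a Wald-type p value. The Python sketch below uses the standard normal approximation on the log hazard ratio; it is our illustration, not the method used in the study, and the function names are hypothetical. Because the inputs are rounded and the reported p values may come from a different test, small differences from the published values are expected.

```python
from math import erfc, log, sqrt

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_p_from_hr(hr: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided p value from an HR and its 95% CI."""
    # The CI is symmetric on the log scale: log(HR) +/- Z_975 * SE.
    se = (log(ci_high) - log(ci_low)) / (2 * Z_975)
    z = log(hr) / se
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    return erfc(abs(z) / sqrt(2))


def relative_risk_reduction(hr: float) -> float:
    """Relative reduction in hazard implied by an HR below 1."""
    return 1.0 - hr


# Values quoted in the text for cT4Nany and for cT2/3.
print(round(relative_risk_reduction(0.67), 2))    # 0.33
print(round(wald_p_from_hr(0.67, 0.48, 0.93), 3))
print(round(wald_p_from_hr(1.42, 0.66, 3.06), 3))
```

For the cT2/3 subgroup the approximation gives roughly 0.37, matching the reported value, and for cT4Nany it gives roughly 0.018, close to the reported 0.019.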

Fig. 3

Relative risk reduction of the CSC group for progression-free survival in subgroups determined by different T/N stages. Hazard ratio of the CSC group for progression-free survival based on a different radiological method of lymph node evaluation

Patients meeting the cT4Nany criterion also showed the lowest HR for OS with neoadjuvant chemotherapy, though this was not statistically significant (HR 0.78, 95% CI 0.54–1.14, p = 0.220) (Figure S2).

Discussion

In this exploratory analysis from the PRODIGY study, we investigated baseline radiologic assessments to identify a subgroup of patients with gastric cancer who may benefit the most from neoadjuvant chemotherapy. Because the radiological-pathological correlation was suboptimal with the size and morphology method used in the PRODIGY study, we additionally applied the size-only method. Based on either radiological method, the patient subgroup with cT4Nany exhibited a lower percentage of pathologic stage I disease (5%) while preserving sensitivity for pathologic stage III disease (80.7%). Accordingly, the relative risk reduction from neoadjuvant chemotherapy was most prominent in patients meeting this criterion. Our results indicate that neoadjuvant chemotherapy may be preferentially considered for patients with cT4 disease. Since the addition of oxaliplatin or docetaxel to adjuvant S-1 in patients with pathological LN positivity or pStage III tumors, respectively, improved the clinical outcomes in an adjuvant chemotherapy setting [14, 24], defining optimal candidates for neoadjuvant chemotherapy is increasingly important. To our knowledge, this is the first study to explore the potential patient selection criteria in a neoadjuvant treatment setting for patients with gastric cancer.

Neoadjuvant chemotherapy is selected based on the clinical staging at baseline; therefore, reliable radiological assessments are necessary for patient selection. Our findings point to the importance of identifying the cT4 stage. On CT images, cT4a tumors are characterized by an irregular outer margin of the outer layers (vs. a smooth margin in cT3) and a dense band-like perigastric fat infiltration (vs. linear fat infiltration in cT3) [13, 20]. Given the suboptimal concordance between clinical and pathological T staging in the JCOG1302A study [11] and in ours, endoscopic ultrasound (EUS) may be considered for further T staging and differentiation between early-stage versus locally advanced disease [23, 25]. However, the diagnostic yield of EUS is reportedly comparable to that of CT [12, 26, 27], and EUS did not improve the diagnostic accuracy of T staging in the JCOG1302A study [11]. Moreover, cT4 stage assessed by CT showed an acceptable sensitivity to detect pT4 and was associated with a minimal proportion of pStage I disease in our analysis. Therefore, CT-based evaluation would be a realistic modality for patient selection in the context of selecting neoadjuvant chemotherapy.

Radiological LN evaluation is challenging and lacks worldwide consensus. There have classically been two radiological methods for evaluating LN positivity: the size-only method, which counts the short-axis diameter with cutoffs of 8–10 mm [11, 18, 19], and the size and morphology method, which considers the shape, contour, and enhancement pattern together with a lower size cutoff (5 mm) [15, 21,22,23]. While the latter is recommended by the European Society for Medical Oncology guideline [23], there has been no study that directly compared the usefulness of these methods. Our data indicate that the size and morphology method could more sensitively detect pathological LN positivity, but the positive predictive value was better with the size-only method.

As in the previous approach, we attempted to delineate the clinical criteria to maximize the inclusion of pStage III disease and minimize the inclusion of pStage I disease, given the trade-off between the inclusion of pStage I disease and the sensitivity for pStage III disease [11]. By both methods of evaluating the LNs, criteria that involve cT4 disease (i.e., Criteria 4 [cT4Nany] and 5 [cT4N+] by the size and morphology method and Criteria G [cT4Nany] and H [cT4N+] by the size-only method) included a minimal proportion of pStage I disease (≤ 5%). Furthermore, cT4Nany by both methods and cT4N+ by the size and morphology method (Criterion 5) exhibited an acceptable sensitivity for detecting pStage III disease (≥ 75.9%), whereas cT4N+ by the size-only method (Criterion H) showed lower sensitivity (51.1%). The criterion previously suggested by the JCOG1302A study [11] (cT3/4N+ by the size-only method) exhibited an 8.1% proportion of pStage I and a 61.1% sensitivity for pStage III disease in our cohort, which appears to be relatively suboptimal. This may be attributable to the different patient inclusion criteria: the JCOG1302A study included cT2-4 disease regardless of LN status, whereas the PRODIGY study included cT2/3 disease only when it was deemed clinically LN-positive. This discrepancy suggests the need for further validation of these results.

On the other hand, about 17% of patients with cT4 disease had pStage IV in the SC group. Neoadjuvant chemotherapy may potentially downstage advanced cancers with limited distant metastasis, thereby making these tumors resectable. The results of the AIO-FLOT3 study of patients with limited metastatic disease (M1) who received surgery following neoadjuvant chemotherapy exhibited a favorable OS (median 31.3 months), which is also in line with this concept [28]. Therefore, the cT4-based criterion appears to be a rational choice for selecting candidates for neoadjuvant chemotherapy.

One of the most important aspects of this study is that the association with the pathological findings of the cT4-based criterion was translated into PFS benefits. The presence of the well-balanced control arm (SC group) from the clinical trial enabled a comparison of the clinical outcomes, which enhances the value of the analyses. The criteria with a minimal proportion of pStage I (cT4Nany by both methods and cT4N+ by the size and morphology method) were associated with the lowest hazard ratio for PFS (HR 0.67 and 0.68, respectively). The previously suggested cT3/4N+ criterion by the size-only method showed a less prominent risk reduction (HR 0.86), which is in line with the results of our correlation analysis of the radiologic and pathologic assessments. Given that the cT4N+ criterion excludes patients with cT4N0 disease, applying the cT4Nany criterion appears to be rational, at least in this neoadjuvant DOS context.

Although our results suggest that patients with cT4Nany disease may preferentially benefit from neoadjuvant chemotherapy, whether patients with cT2/3 disease should be excluded from neoadjuvant chemotherapy needs to be interpreted cautiously. In our subgroup analysis, a clinical benefit of neoadjuvant chemotherapy in patients with cT2/3 disease was not evident (HR 1.42; p = 0.370), and these stages were associated with a non-negligible proportion of pStage I disease (≥ 23.2%). Given that patients with pStage I disease exhibit favorable clinical outcomes (5-year DFS rate around 90%) [29, 30], these results raise the issue of potential overtreatment for patients with early-stage disease. This issue is particularly relevant in Asia, whereas in Western countries gastric cancer is usually diagnosed at later stages. Indeed, although the inclusion criteria of the FLOT4 study (i.e., ≥ cT2 or cN+ or both) [31] were similar to those of the PRODIGY study (i.e., cT2/3N+ or cT4Nany), the worse survival outcomes in FLOT4 indicate the predominant inclusion of patients at later stages in Western countries. Whether these patients could benefit from neoadjuvant chemotherapy can only be answered by a randomized controlled trial (RCT). However, because of the low rate of disease-free survival (DFS) events and potential overtreatment in these patients, an RCT addressing this issue may be practically difficult to conduct, especially in Asia. Since patients with cT2/3N+ disease were included in the main analysis of the original PRODIGY study, it may be inappropriate to conclude that these patients should not be considered candidates for neoadjuvant chemotherapy, especially in Western countries.

This study has some limitations to consider, including the fact that the analyses were not pre-specified before the PRODIGY study was initiated. First, because the PRODIGY study included patients with cT2/3N+ or cT4Nany disease, the proportion of patients with different TN stages may not fully recapitulate that of actual clinical practice. Second, our findings were not validated in an independent cohort, although there is no appropriate cohort for validation at this time. Third, the OS results remain immature. Because the original PRODIGY study included a higher proportion of pathologically earlier stage diseases than expected, the planned number of PFS events was not reached and the observed power for OS events was only 17% [10]. Therefore, it may be difficult to achieve a statistically significant difference for OS analysis, but the same trend for OS benefit in patients with cT4 disease supports the main idea of this study. Although a DFS benefit based on a similar definition of PD that considers incomplete resection as a PD event was translated into an OS benefit in a neoadjuvant setting [32], the surrogacy of PFS for OS has not been firmly established in this context. Therefore, our findings should be confirmed with long-term follow-up data. We are currently waiting for the maturation of OS data and planning to report the updated OS results of the PRODIGY study.

In conclusion, in the PRODIGY study, patients with gastric cancer and cT4Nany disease appeared to benefit preferentially from neoadjuvant chemotherapy. The cT4-based criterion may help select patients for neoadjuvant chemotherapy and guide future clinical studies of neoadjuvant chemotherapy, especially in Asia. Our findings require further investigation and validation in independent cohorts with different neoadjuvant regimens.