There is increasing interest in leveraging ‘big data’ to address important clinical questions, with the ultimate goal of improving human health, including outcomes for those with cancer.1 Examples include ‘CancerLinQ’, an initiative developed by the American Society of Clinical Oncology to improve patient care by integrating large volumes of clinical data with analytical computing tools,2 and ‘Sentinel’, an initiative developed by the United States (U.S.) Food and Drug Administration to improve drug safety monitoring after regulatory approval of new drugs.3 Mining big data can sometimes lead to big surprises.

In this issue of Breast Cancer, investigators from the U.S. National Cancer Institute leveraged big data routinely collected by the federally funded Surveillance, Epidemiology, and End Results (SEER) program. As described by the authors, SEER is a population-based cancer surveillance program covering ~30% of U.S. residents. Although SEER added breast cancer multi-gene test results to its data collection menu beginning in 2010, including the Genomic Health Oncotype DX Recurrence Score (RS), this analysis required electronically linking the SEER clinical data with the RS data generated between 2004 and 2010 via a collaboration with Genomic Health. The report included 38,568 patients with non-metastatic, estrogen-receptor (ER)-positive, HER2-negative breast cancer, or about 10-fold more than the 3,958 patients originally reported in all prior publications evaluating RS, including five clinical trial cohorts (B14,4 n=668; B20,5 n=651; ATAC,6 n=1017; S8814,7 n=367; and E2197,8 n=465), and one prior population-based cohort (n=790).9 Unadjusted 5-year breast-cancer-specific mortality (BCSM) rates for the low (<18), intermediate (18–30), and high (>30) RS groups were 0.4, 1.4, and 4.4% for node-negative disease (P<0.001), and 1.0, 2.3, and 14.3% for node-positive disease (P<0.001), respectively. RS remained significantly prognostic in the node-negative cohort when adjusted for age, tumor size, grade, race, and reported adjuvant chemotherapy use. TAILORx (Trial Assigning Individualized Options for Treatment) is a clinical trial designed to determine whether adding chemotherapy to endocrine therapy is beneficial in patients with a mid-range RS of 11–25 and node-negative, ER-positive, HER2-negative disease (NCT00310180).10 Using the modified RS categories employed in the TAILORx trial for low (RS<11), intermediate (RS 11–25), and high (RS>25),10 5-year BCSM rates in node-negative disease were 0.4, 0.7, and 3.6%, respectively (P<0.001).
Although results have not yet been reported for the randomized group with a RS of 11–25, trial participants with a RS<11 treated with endocrine therapy alone had about a 1% risk of distant recurrence at 5 years.11 The RxPonder trial (Rx for Positive Node, Endocrine-Responsive Breast Cancer) is also evaluating the benefit of chemotherapy in a higher risk population with 1–3 positive nodes and a RS of 25 or lower (NCT01272037).12 Similar to the TAILORx low-risk registry, a recent report from a prospective registry of patients with node-negative, ER-positive, HER2-negative breast cancer reported highly favorable outcomes for patients with a low RS; 5-year distant recurrence rates were 0.5, 2.3, and 4.0% for a RS<18, 18–30, and >30, respectively, associated with corresponding chemotherapy use rates of 1%, 28%, and 85%.13
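The cohort sizes quoted in this paragraph can be checked with simple arithmetic. The short Python sketch below sums the six previously reported cohorts and compares the total with the SEER cohort; the variable names are illustrative only, and all numbers are taken directly from the text.

```python
# Cohort sizes as quoted in the paragraph above (illustrative variable names).
prior_cohorts = {
    "B14": 668,
    "B20": 651,
    "ATAC": 1017,
    "S8814": 367,
    "E2197": 465,
    "prior population-based cohort": 790,
}
seer_patients = 38568

# Total number of patients in all previously reported RS cohorts.
prior_total = sum(prior_cohorts.values())

# Ratio of the SEER cohort to the combined prior cohorts.
fold_increase = seer_patients / prior_total

print(prior_total)               # 3958
print(round(fold_increase, 1))   # 9.7
```

The six cohorts sum to the 3,958 patients cited, and the SEER cohort is roughly 9.7 times larger, consistent with the "about 10-fold" description.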

Although the results of the SEER analysis do not define the precise RS threshold predictive of adjuvant chemotherapy benefit, they nevertheless provide clear evidence that the assay provides important prognostic information that is generalizable to clinical practice. Several points are worth noting in interpreting these results and placing them in context. First, the end point for the SEER analysis was 5-year BCSM. In prior reports, RS was shown to provide prognostic information for distant metastasis by 10 years; distant metastasis is generally incurable and leads to death within about 3 years on average.14 Thus, the BCSM reflects not only distant metastasis, but also early distant recurrence rapidly leading to death, a clinically meaningful end point. Second, non-breast-cancer mortality was not associated with RS, providing greater confidence that the cause of death was properly classified. Third, although chemotherapy reduces the risk of mostly early recurrences within 5 years of diagnosis,15 late recurrence beyond 5 years accounts for at least one-half of recurrences in ER-positive breast cancer,15–18 indicating that 5-year BCSM rates will underestimate the true mortality burden in this population.

Beyond the reassuring findings supporting the robustness of the prognostic information provided by RS, another important but surprising finding was the observation of higher 5-year BCSM rates among older women. In the multivariate model including only patients with node-negative disease, an age of 70 years and older was associated with significantly higher 5-year BCSM, which was driven largely by differences in the high RS group. One potential explanation is lower chemotherapy use associated with age. For the node-negative population in this cohort who had a high RS, chemotherapy use was reported in 78%, 75%, 74%, 57%, 56%, and 32% of patients less than 40, 40–49, 50–59, 60–69, 70–79, and 80 years or older, respectively. For patients with node-positive disease and a high RS, chemotherapy use was reported in 52% of patients 70–79 years of age and 50% of patients 80 years or older. These findings are particularly concerning in view of previous data likewise showing poorer breast-cancer-specific survival for older women with early-stage breast cancer.19 Although chemotherapy use was likely underreported in the SEER analysis, the findings again support the known age bias against chemotherapy use in older women, even in those at high risk of recurrence who could potentially derive greater benefit from therapy.20

Breast cancer is a disease of aging, with a threefold higher risk for women older than 70 years compared with those aged 50–59 years.21 Owing to increasing life expectancy22 and increasing incidence with advancing age, a 30% increase in the overall incidence of breast cancer is expected by the year 2030, driven largely by a 57% increase in women 65 years or older.23 Age alone should not be used to make adjuvant treatment decisions, as many older women in their 70s and 80s have life expectancies far in excess of 5 years, the follow-up time reported in this SEER-based study. For example, the life expectancy of a woman in average health at 70, 75, and 80 years is 12, 9, and 7 years, respectively.24 In addition, older women with early breast cancer in good health appear to derive the same benefits from chemotherapy as younger women,25 although there is greater toxicity.26 Broader use of validated online models that accurately estimate life expectancy, such as ePrognosis (http://eprognosis.ucsf.edu/index.php), may assist clinicians when formulating adjuvant treatment plans for older patients.

In conclusion, the findings from this population-based study provide additional evidence supporting the clinical validity of the 21-gene RS assay, indicating that its prognostic information is robust, clinically meaningful, and generalizable to real-world clinical practice. The clinical utility of the assay will likely be further refined once the results of the TAILORx and RxPonder trials are available. In the interim, given the paucity of elderly women enrolled in prospective clinical trials,27 the findings from this study provide important information suggesting that greater chemotherapy use in elderly patients with high-risk breast cancer could substantially reduce early breast cancer mortality, if used in appropriately selected patients and in accordance with currently recommended guidelines.27