Abstract
Approaches for rapidly identifying patients at high risk of early breast cancer recurrence are needed. Image-based methods for prescreening hematoxylin and eosin (H&E) stained tumor slides could offer temporal and financial efficiency. We evaluated a data set of 704 1-mm tumor core H&E images (2–4 cores per case), corresponding to 202 participants (101 who recurred; 101 non-recurrent matched on age and follow-up time) from breast cancers diagnosed between 2008–2012 in the Carolina Breast Cancer Study. We leveraged deep learning to extract image information and trained a model to identify recurrence. Cross-validation accuracy for predicting recurrence was 62.4% [95% CI: 55.7, 69.1], similar to grade (65.8% [95% CI: 59.3, 72.3]) and ER status (66.3% [95% CI: 59.8, 72.8]). Interestingly, 70% (19/27) of early-recurrent low-intermediate grade tumors were identified by our image model. Relative to existing markers, image-based analyses provide complementary information for predicting early recurrence.
Introduction
Early recurrence, herein defined as the return of a primary tumor within three years of diagnosis, is an important endpoint in clinical management of breast cancer1. Recurrences can often be successfully managed, but they are stressful, costly, and increase risk of mortality if not detected early2,3,4. Clinical risk stratification is currently based on several clinical characteristics, including hormone receptors, HER2 status, grade, stage, and age, and RNA-based methods are available to identify tumors with high risk of recurrence5,6,7,8. Clinical gene expression assays are not uniformly performed on all patients, and are often limited to specific subgroups of patients with low-stage and ER-positive disease. Genomic assays are also expensive9, so histopathology-based stratification is appealing. Currently, only combined histologic grade—a metric that classifies breast tumors according to tubule formation, nuclear pleomorphism, and mitotic frequency—is routinely collected from H&E images in the clinic. Grade evaluation is performed manually and is subject to interobserver variability. An objective, image-based method could be valuable for prescreening patients at higher risk of recurrence.
Recent work in computer vision has extensively explored using deep convolutional neural networks (CNNs) to extract global contextual information from a variety of image types. Early applications of CNNs focused primarily on natural images (e.g., cars or birds)10, but more recently, methods have been extended to medical images, including radiographic and histopathologic images11,12,13. Machine learning methods in image classification have been shown to detect or diagnose invasive breast cancer using both histopathological and radiographic images14,15,16,17,18. However, few studies have evaluated breast cancer outcomes based on images, and most that have are limited in sample size, range of tumor phenotypes, or patient diversity19,20,21. For example, several data sets (e.g., the Camelyon challenge in the Netherlands and the IBM-curated BRIGHT) have encouraged researchers to investigate benign and neoplastic breast tissue using machine learning methods; however, these are largely focused on diagnostic capacity rather than prognostic or predictive modeling. Campanella and colleagues used whole slide images (WSI) from multiple cancers to develop a predictive model of invasive disease, but again, this work was focused on diagnostic rather than prognostic applications16. Many other previous studies have emphasized a priori hypotheses, such as associations with the spatial arrangement of immune cells22, or emphasized overall or breast cancer-specific survival rather than recurrence. Furthermore, many data sets—even for diagnostics—do not include diverse populations of women with breast cancer. In the US, Black women have significantly higher recurrence rates and breast cancer mortality, but often have lower representation in clinical and observational research. We used data from a source that represented both Black and non-Black women in similar proportions, allowing us to investigate breast cancer recurrence in a diverse setting.
We sought to investigate whether we could use image information extracted with a CNN (VGG1623, a CNN pre-trained on ImageNet) together with support vector machines (SVM)24 to create image-based classes that were predictive of recurrence among breast cancer patients. We assessed reproducibility and inter- and intra-individual variance by comparing validation accuracy across and within patient specimens and compared results to existing, established biomarkers. The Carolina Breast Cancer Study (CBCS3) is a well-annotated image dataset for a diverse group of women (50% Black, 50% under age 50) who were followed for medical record-confirmed recurrence.
Results
Study population
Table 1 shows our study population and indicates that the matched dataset (n = 101 matched pairs) had a similar distribution relative to the full population represented on the tissue microarrays (TMAs, n = 1543). However, the subset of CBCS cases included on the TMA tended to include more large size tumors and higher-grade tumors relative to the entirety of CBCS. Relative to participants who experienced an early recurrence, age-matched participants without early recurrence were significantly more likely to be early stage (stage 1 52.5% vs 10.9% in early recurrences), grade 1 or 2 (58.4% vs 26.7% in early recurrences), and ER-positive (76.2% vs 43.6% in early recurrences).
Model prediction accuracy
First, we assessed accuracy for detecting recurrence within three years of diagnosis in our balanced data set, displayed in Table 2. In cross-patient 10-fold cross-validation we observed 62.4% accuracy and 63.4% sensitivity, whereas within-patient validation yielded 70.3% accuracy (67.7% sensitivity). In both approaches, sensitivity and specificity were well balanced: within-patients specificity (72.9%, 95% CI: 64.2, 81.6) was slightly higher than sensitivity (67.7%, 95% CI: 58.6, 76.8), and cross-patients sensitivity (63.4%, 95% CI: 54.0, 72.8) was slightly higher than specificity (61.4%, 95% CI: 51.9, 70.9). To contextualize these accuracy estimates, we also evaluated the accuracy of standard clinical markers. Using grade and ER status as predictors of recurrence resulted in accuracies of 65.8% and 66.3%, respectively, but grade had higher sensitivity (73.3%, 95% CI: 64.7, 81.9) while ER status had higher specificity (76.2%, 95% CI: 67.9, 84.5). When pre-screening tumors for genomic testing, sensitivity to detect aggressive tumors is the higher priority.
To investigate recurrence prediction accuracy among clinically low- or high-risk tumors, we further stratified our accuracy assessment by grade (low/intermediate vs high) (Table 2). Accuracy was higher in the low/intermediate grade group than in the high grade group for both the within-patients approach (77.1% vs 65.2% in low vs. high grade) and the cross-patients validation approach (61.6% vs. 53.4% in low vs. high grade). Sensitivity was lower among low/intermediate-grade tumors, while specificity was lower for high-grade tumors. However, the sensitivity of both image-based approaches in the low/intermediate group exceeded that of ER status (70.4% for within-patients and 48.1% for cross-patients vs. 22.2% for ER status). A total of 19 low/intermediate group patients (70% of patients with low/intermediate grade tumors who recurred within 3 years) who would have been missed via grade alone were detected by image analysis.
Time-to-event analysis
To also consider time-to-event (and not just binarized early recurrence vs. not), we evaluated both the within- and cross-patients predictors in time-to-recurrence based on Kaplan–Meier analysis (Fig. 1). Image-based classes from the within-patients approach (HR 2.70; 95% CI: 1.78, 4.11) had a slightly stronger hazard of recurrence than the cross-validation-derived classes (HR 1.73; 95% CI: 1.16, 2.57), but both were significantly associated with time to recurrence.
Comparison with genomic assays
When comparing image-based classes to existing molecular metrics that represent risk of recurrence, specifically research-based versions of PAM50-derived ROR-PT and OncotypeDX scores, we observed that the image-based “high-risk” class was substantially enriched for individuals whose tumors had also been classified as high-risk by either ROR-PT or OncotypeDX (Table 3). The cross-patients approach resulted in an image-based high-risk class with the highest proportion of molecularly high-risk individuals [OncotypeDX RFD (95% CI): 15.0% (8.6, 21.3), ROR-PT RFD (95% CI): 21.5% (14.2, 28.5)], though the high-risk class resulting from the within-patients approach was still substantially associated with high-risk molecular features [OncotypeDX RFD (95% CI): 11.7% (5.3, 18.1), ROR-PT RFD (95% CI): 17.4% (10.0, 24.5)].
Discussion
We applied convolutional neural networks to detect early recurrence in the diverse, clinically well-annotated Carolina Breast Cancer Study. We found that image-based features predict recurrence with accuracy, sensitivity, and specificity comparable to those of standard clinical markers such as estrogen receptor status and grade. The performance characteristics of image-based classifiers differed within strata defined by grade, suggesting that future optimization should consider training separate models within strata of grade. Nonetheless, these image-based classifiers predicted recurrence with significant hazard ratios and showed associations with risk-based genomic signatures. Since genomic signatures were not used in training, the association with genomic data suggests promise for rapid, low-cost pre-screening of tumors that may need further genomic testing. Image-based pre-screening may become increasingly important as the proportion of neoadjuvant-treated breast cancer cases increases25, because in neoadjuvant cases tissue for diagnostic purposes is limited to biopsy materials. It is also advantageous that image-based methods use the same data collected for diagnosis and require no additional laboratory steps.
Our analysis is unique in that we trained on recurrence rather than on other genomic or clinical data. Previous machine learning studies have predicted breast cancer recurrence and survival, most commonly using clinical and demographic data as inputs to ML algorithms26,27,28,29. Lou et al. compared an array of computational methods on registry data from 1140 patients, using collected medical records as inputs to the machine-learned classifier26. Our approach used a much smaller image dataset for training but does not assume that the clinical data are mediators of the recurrence outcomes, allowing us to discover features that may not be captured in other clinical data. The promising results obtained with a small sample size suggest that future, larger studies with more images may improve accuracy further.
Our recurrence-trained classifier also recapitulated genomic risk subtypes, with high image-based risk groups being more likely to have high genomic ROR-PT and OncotypeDX scores. Other researchers have predicted genomic scores from images. Whitney et al. used 178 breast tumor H&E images to predict RNA-based OncotypeDX scores13. Accuracy in that study was 74% for low-intermediate vs high OncotypeDX score. Because we did not predict OncotypeDX scores directly, we cannot compare accuracies head-to-head for a PAM50-based risk of recurrence score, but our results suggest that training on recurrence rather than a recurrence score is a viable strategy. Also, within the CBCS, our group demonstrated that ML methods could be utilized to predict tumor features such as ER status, grade, and subtype using images and data from CBCS312. Accuracy in that analysis for prediction of high vs low-medium ROR-PT score was 75%, which is slightly higher than our results for recurrence. However, the training set for that analysis was much larger than the recurrence vs. non-recurrence dataset used here and the outcome was more common. Thus, future analyses with larger datasets should evaluate the optimal method for identifying high risk specimens.
Applying breast tumor tissue core images to predict binary recurrence outcomes is a difficult problem because of a few key challenges. First, the CNN-derived input data (512 features per sample) are far higher-dimensional than researcher-selected features like grade, stage, and other clinical characteristics. Second, early recurrence rates in breast cancer are fortunately relatively low, but as a result, recurrence data from breast cancer cohorts are highly imbalanced. Matching each recurrent case with a non-recurrent participant allowed us to overcome some of the challenges of applying machine learning techniques to such a strongly imbalanced data set.
Higher sensitivity when training within-patients suggests some similarity of the images in training was being leveraged in testing and raises the intriguing hypothesis that repeated samples of images from patients have some individuality or ‘identifiability’. Whether this identifiability is clinically meaningful merits further investigation. On the molecular level, Perou et al. suggested that tumors are individuals and that this individuality may be targetable for precision medicine30; if tumors are similarly individual on a histological level, perhaps machine-learning techniques to evaluate histologic distinctions across a tumor could be used to identify subgroups or to study tumor evolution between biopsies and excisions/mastectomies. In any case, establishing the reproducibility of classification across samples from a given tumor is an important future direction if histologic biomarkers are to be used for risk prediction.
In summary, our proposed image-based approaches achieve competitive prediction accuracies on the order of established biological and clinical markers (i.e., grade and ER status), with balanced sensitivity and specificity. Among patients in the low/intermediate grade subgroup, both approaches were more sensitive than ER status. This analysis underscores the promise of training histopathologic predictors directly on recurrence rather than clinical surrogates, and emphasizes the need for larger, collaborative analysis of breast cancer outcome where sufficient event sizes and inter-study comparisons can be made. The benefits of a histologic approach for risk stratification could be significant, particularly for low-grade patients where the current markers (grade and ER) are not sensitively capturing risk of recurrence.
Methods
Study population
CBCS3 is a prospective, population-based cohort of 2998 women with incident invasive breast cancer recruited from 44 counties in North Carolina between 2008 and 2013. First, primary breast cancer cases were identified using rapid case ascertainment in collaboration with the North Carolina Central Cancer Registry. Eligible women were between 20 and 74 years old. Black and young (<50 years old) women were oversampled to each represent 50% of the population. The study was approved by the University of North Carolina Institutional Review Board in accordance with U.S. Common Rule. All study participants provided written informed consent prior to study entry. This study complied with all relevant ethical regulations, including the Declaration of Helsinki.
Breast cancer recurrence was ascertained by patient self-report at annual telephone follow-ups and then confirmed by medical record. Formalin-fixed, paraffin-embedded (FFPE) tumor blocks were obtained from participating medical centers for participants with available tissue. Tumor blocks were obtained for 1743 of the women enrolled in the study and were reviewed by the study pathologist (JG). From tumor-enriched regions selected by the pathologist, between one and four 1-mm tumor cores were sampled and embedded in tissue microarrays (TMAs) at the Translational Pathology Laboratory at UNC-Chapel Hill. TMA slides were sectioned (5-μm thickness) and top and bottom sections were stained with hematoxylin and eosin (H&E) and scanned at 20x magnification. In the balanced dataset, the 202 participants corresponded with 704 H&E core images of approximately 3000 × 3000 pixels and each participant had between two and four 1-mm tumor cores.
Tumor grade was determined centrally by the study pathologist, except where whole slide images were unavailable for secondary review and the originally reported (“clinical”) tumor grade was used (n = 7) or where both slides and clinical grade were missing (n = 39). From 1743 women with TMA images, participants were excluded if missing grade or if tissue was insufficient for image analysis (i.e., core damaged, section folded, or tumor depleted; n = 30). Because this study was aimed at evaluating recurrence, women with metastatic disease at diagnosis were excluded (n = 40), resulting in a final eligible population of 1644 women, corresponding to a total of 5969 core images. Approximately 7% of the study population experienced an early recurrence (n = 101), defined as recurrence within three years of diagnosis. To construct a balanced dataset of recurrent and non-recurrent cases, we matched recurrent cases to non-recurrent participants 1:1 on age (defined in 5-year bins).
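The 1:1 matching on binned age can be sketched as a simple greedy pairing. This is an illustrative simplification with hypothetical ages; the study's exact matching procedure (including matching on follow-up time) may differ:

```python
def match_controls(cases, controls, bin_width=5):
    """Greedy 1:1 matching of non-recurrent controls to recurrent cases on age bin.
    Simplified sketch: ages are binned into 5-year intervals, and each case is
    paired with one unused control from the same bin."""
    by_bin = {}
    for c in controls:
        by_bin.setdefault(c["age"] // bin_width, []).append(c)
    matched = []
    for case in cases:
        pool = by_bin.get(case["age"] // bin_width, [])
        if pool:
            matched.append((case, pool.pop()))
    return matched

# Hypothetical ages, for illustration only
cases = [{"id": i, "age": a} for i, a in enumerate([43, 57, 61])]
controls = [{"id": 100 + i, "age": a} for i, a in enumerate([44, 41, 58, 63, 70])]
pairs = match_controls(cases, controls)
print(len(pairs))  # 3 -- every case finds a same-bin control here
```

In practice, matching can also fail for some cases (an empty bin), which is why matched designs report the final matched n rather than assume a complete pairing.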
Gene expression assays
Where available, additional FFPE specimens from CBCS3 participants were obtained for RNA extraction. RNA was isolated using RNeasy FFPE Kits (Qiagen) and Nanostring gene expression assays were performed at UNC Chapel Hill in the Translational Genomics Lab. Gene expression data were cleaned and normalized as described previously31. Of the 1543 women included in the study, 1145 women had data on genes required for the PAM50 predictor, a research-only version of the Prosigna clinical assay. These genes were used to calculate a PAM50 risk-of-recurrence (ROR) score. For this study, the ROR-PT score was used, which additionally incorporates information on the PAM50 subtype, proliferation score (P), and tumor size (T)7. These scores were then categorized into low-intermediate and high risk. Data on the 21 genes included in the OncotypeDX score assay were available for 937 of the 1543 women on study. These genes were used to approximate OncotypeDX scores for these patients, which were then categorized into low-intermediate (<26) and high (26+) risk.
Model specification and validation
Image preprocessing and feature extraction
The appearance of core images was standardized in each color channel to have mean zero and standard deviation one. We used a CNN32 to extract feature representations of core images (n = 704 cores from 202 participants). The CNN first applies convolutional filters followed by pooling operations. Lower-level layers learn generic image features such as edges, intermediate-level layers capture increasingly complex properties like shape and texture, and higher-level layers learn global concepts that describe the semantic meaning of the images32,33. The parameters of such a network are the weights of the convolutional filters and are learned from the data in an adaptive manner, creating a hierarchical set of features of increasing abstraction.
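The per-channel standardization step can be sketched in a few lines; here a random array stands in for a scanned RGB core image (real cores are approximately 3000 × 3000 pixels):

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.random((256, 256, 3))  # stand-in for an RGB core image

# Per-channel standardization: subtract each channel's mean, divide by its SD,
# so every color channel has mean zero and standard deviation one.
standardized = (img - img.mean(axis=(0, 1))) / img.std(axis=(0, 1))

print(np.allclose(standardized.mean(axis=(0, 1)), 0.0))  # True
print(np.allclose(standardized.std(axis=(0, 1)), 1.0))   # True
```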
We used the VGG1623 network pre-trained on the ImageNet dataset, without alteration. We explored other networks (e.g., ResNet), but VGG16 yielded better cross-validation and test set accuracy. ImageNet34 contains 1.2 million images, each belonging to one of 1000 ImageNet object categories35; although these categories differ from histology images, the pre-trained weights transfer well to feature extraction from tissue sections. To extract features of the core images, we fed the core images into the pre-trained VGG16 network. Each layer transforms the features obtained from the previous layer based on its parameters and learns concepts like shape and texture that are complex enough to generalize but not so specific to images in ImageNet as to be inapplicable to histology images. In principle, one can use the features at any layer as the feature representation of the input image. Similar to the approach of Couture et al.12, we evaluated the performance of features extracted from different layers of the pre-trained VGG16 network. We ran a grid search over the feature extraction layers on 90% of the data (training set) and evaluated performance on the remaining 10% (validation set). The 7th layer had the highest validation accuracy and was therefore selected for model development. A total of 704 cores across 202 participants were used to extract a 512 × 64 × 64-dimensional matrix of image features per core. Spatial mean pooling was then used to produce a feature vector of length 512 for each tumor core for use in prediction analysis.
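The extraction itself requires the pretrained VGG16 weights, but the spatial mean pooling step can be sketched in isolation. Below, a random array stands in for the 512 × 64 × 64 feature map of one core:

```python
import numpy as np

# Stand-in for the 512 x 64 x 64 VGG16 feature map of a single core image
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((512, 64, 64))

# Spatial mean pooling: average over the 64 x 64 spatial grid,
# leaving one summary value per channel.
feature_vector = feature_map.mean(axis=(1, 2))
print(feature_vector.shape)  # (512,)
```

Pooling discards the spatial layout but reduces each core to a fixed-length vector, which is what makes a classifier like an SVM applicable downstream.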
We first attempted to use fully connected layers, but this resulted in overfitting. Therefore, we used a support vector machine (SVM) to classify the patient-level features that predict binary recurrence. An SVM24 is a classification algorithm that finds a linear decision boundary separating the two classes (here, recurrent vs non-recurrent). The foundational idea of the SVM is that if one interprets the margin between a data point and the decision boundary as the difficulty of classifying that point (i.e., the smaller the margin, the more difficult the point is to classify), then one seeks the decision boundary that maximizes the margin, thereby minimizing classification difficulty. After locating the decision boundary, an SVM predicts the class assignment of each data point based on which side of the decision boundary it falls. We used those predicted class assignments for further analysis of the recurrence prediction.
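A minimal sketch of this classification step, using synthetic 512-dimensional vectors in place of the pooled VGG16 features (the class-dependent shift is artificial, added only so the toy data are separable):

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-in: 512-dim pooled feature vectors for 40 hypothetical patients
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 512))
y = np.repeat([0, 1], 20)   # 0 = non-recurrent, 1 = recurrent
X[y == 1] += 0.5            # synthetic signal, for illustration only

# Linear SVM: find the maximum-margin boundary, then read off class assignments
clf = SVC(kernel="linear").fit(X, y)
predicted_classes = clf.predict(X)
print((predicted_classes == y).mean())
```

With 40 points in 512 dimensions the toy classes separate easily; the real difficulty in the study is that the signal in histology features is far weaker than this synthetic shift.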
We also explored training a neural classifier for predicting early recurrence in an end-to-end fashion; however, due to the small sample size of our balanced data set, the trained deep network overfit the training data and poorly distinguished recurrences in the test set.
Validation datasets
Ideally, data are split into training, validation, and test sets to evaluate the accuracy of ML algorithms; however, the balanced data set (n = 202) was small, so we assessed performance via two methods. First, because our goal was to have a single feature vector represent each patient, we performed a cross-patient validation (Fig. 2), in which we averaged the feature vectors of the multiple core images per patient to create a single patient-specific feature vector. We then performed ten-fold cross-validation to train and evaluate the model for predicting recurrence, with one-tenth held back for testing in each iteration. We confirmed that adding folds for k-fold validation (i.e., increasing k) increased accuracy, from 56% at 2-fold to 62% at 10-fold. Second, we performed a within-patient validation (Fig. 2), in which we took advantage of the multiple tumor cores per patient and trained the model on the feature vectors of half of each patient’s cores, testing the model on the second half. This latter method assumes that the core images belonging to each patient are independent, which is unlikely given that the correlation between image features within a patient (average cosine similarity within individual patients’ image features = 0.91) was higher than correlations between patients (average cosine similarity between patients’ image features = 0.84). However, we were interested in estimating how the ‘individuality’ of a given tumor contributes to predictive accuracy, and thus, we considered within-patient methods an optimistic estimate of model performance and compared the results to those obtained by cross-validation.
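The within-patient similarity underlying this caveat can be illustrated with synthetic feature vectors, where cores from the same patient share a patient-specific component (the 0.3 noise scale is an arbitrary choice for illustration):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
patient_a = rng.standard_normal(512)   # patient-specific signal
patient_b = rng.standard_normal(512)

# Two cores per patient: shared patient signal plus core-level noise
cores_a = [patient_a + 0.3 * rng.standard_normal(512) for _ in range(2)]
cores_b = [patient_b + 0.3 * rng.standard_normal(512) for _ in range(2)]

within = cosine(cores_a[0], cores_a[1])    # same patient
between = cosine(cores_a[0], cores_b[0])   # different patients
print(within > between)  # True for this construction
```

Because within-patient similarity exceeds between-patient similarity, a model tested on held-out cores from a patient it has already seen has an easier task than one tested on entirely new patients, which is why the within-patient estimate is treated as optimistic.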
Statistical analysis
The predictive value of image-based classes was assessed using sensitivity, specificity, and overall accuracy of prediction. 95% confidence intervals were produced for these measures using the normal approximation of a binomial proportion. Following development of a binary classification scheme, we performed time-to-event analyses to assess relationships between the SVM-derived image classes and recurrence within the full cohort. Cox proportional hazards models were used to estimate hazard ratios and 95% confidence intervals. Relationships between image-based classes and recurrence were visualized with Kaplan–Meier curves. Generalized linear models with identity link and binomial family were used to estimate relative frequency differences (RFDs) and 95% confidence intervals to describe associations between image classes and molecular risk scores (OncotypeDX and ROR-PT).
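The normal-approximation interval can be checked against the reported numbers; for example, the cross-patients accuracy of 62.4% over the 202 matched participants reproduces the published 95% CI:

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a binomial proportion:
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Cross-patients accuracy reported in the paper: 62.4%, n = 202
lower, upper = wald_ci(0.624, 202)
print(f"{lower * 100:.1f}, {upper * 100:.1f}")  # 55.7, 69.1 -- matches the reported CI
```

The Wald interval is adequate here because the proportions are far from 0 or 1 and n is moderate; for small n or extreme proportions, a Wilson or exact interval would be preferable.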
Data availability
The data analyzed in this study are available from the Carolina Breast Cancer Study (https://unclineberger.org/cbcs/). Restrictions apply to the availability of these data, which were used under data use agreements for this study. Data is not publicly available; however, investigators may submit a letter of intent to gain access upon reasonable request.
References
Colleoni, M. et al. Annual hazard rates of recurrence for breast cancer during 24 years of follow-up: results from the international breast cancer study group trials I to V. J. Clin. Oncol. 34, 927–935 (2016).
Wapnir, I. L. et al. Prognosis after ipsilateral breast tumor recurrence and locoregional recurrences in five National Surgical Adjuvant Breast and Bowel Project node-positive adjuvant breast cancer trials. J. Clin. Oncol. 24, 2028–2037 (2006).
Anderson, S. J. et al. Prognosis after ipsilateral breast tumor recurrence and locoregional recurrences in patients treated by breast-conserving therapy in five National Surgical Adjuvant Breast and Bowel Project protocols of node-negative breast cancer. J. Clin. Oncol. 27, 2466–2473 (2009).
Dent, R. et al. Factors associated with breast cancer mortality after local recurrence. Curr. Oncol. 21, 418–425 (2014).
Harris, L. N. et al. Use of biomarkers to guide decisions on adjuvant systemic therapy for women with early-stage invasive breast cancer: American Society of Clinical Oncology Clinical Practice Guideline. J. Clin. Oncol. 34, 1134–1150 (2016).
Paik, S. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817–2826 (2004).
Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. https://doi.org/10.1200/JCO.2008.18.1370 (2009).
Wallden, B. et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med. Genomics 8, 54 (2015).
Guler, E. N. Gene expression profiling in breast cancer and its effect on therapy selection in early-stage breast cancer. Eur. J. Breast Health 13, 168–174 (2017).
Khan, A., Sohail, A., Zahoora, U. & Qureshi, A. S. A survey of the recent architectures of deep convolutional neural networks. Artif. Intell. Rev. 53, 5455–5516 (2020).
Shen, L. et al. Deep learning to improve breast cancer detection on screening mammography. Sci. Rep. 9, 12495 (2019).
Couture, H. D. et al. Image analysis with deep learning to predict breast cancer grade, ER status, histologic subtype, and intrinsic subtype. Npj Breast Cancer 4, 30 (2018).
Whitney, J. et al. Quantitative nuclear histomorphometry predicts Oncotype DX risk categories for early stage ER+ breast cancer. BMC Cancer 18, 610 (2018).
Abubakar, M. et al. Relation of quantitative histologic and radiologic breast tissue composition metrics with invasive breast cancer risk. JNCI Cancer Spectr. 5, pkab015 (2021).
El Agouri, H. et al. Assessment of deep learning algorithms to predict histopathological diagnosis of breast cancer: first Moroccan prospective study on a private dataset. BMC Res. Notes 15, 66 (2022).
Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).
Das, A. et al. Detection of breast cancer from mammogram images using deep transfer learning. In Advances in Signal Processing and Intelligent Recognition Systems 18–27. https://doi.org/10.1007/978-981-16-0425-6_2 (2021).
Das, H. S. et al. Breast cancer detection: shallow convolutional neural network against deep convolutional neural networks based approach. Front. Genet. 13, 1097207 (2023).
Klimov, S. et al. A whole slide image-based machine learning approach to predict ductal carcinoma in situ (DCIS) recurrence risk. Breast Cancer Res. 21, 83 (2019).
Turkki, R. et al. Breast cancer outcome prediction with tumour tissue images and machine learning. Breast Cancer Res. Treat. 177, 41–52 (2019).
Wulczyn, E. et al. Deep learning-based survival prediction for multiple cancer types using histopathology images. PLoS One 15, e0233678 (2020).
Fassler, D. J. et al. Spatial characterization of tumor-infiltrating lymphocytes and breast cancer progression. Cancers 14, 2148 (2022).
Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. Preprint at https://doi.org/10.48550/arXiv.1409.1556 (2015).
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning (Springer, 2009). https://doi.org/10.1007/978-0-387-84858-7.
Puig, C. A., Hoskin, T. L., Day, C. N., Habermann, E. B. & Boughey, J. C. National trends in the use of neoadjuvant chemotherapy for hormone receptor-negative breast cancer: a national cancer data base study. Ann. Surg. Oncol. 24, 1242–1250 (2017).
Lou, S.-J. et al. Machine learning algorithms to predict recurrence within 10 years after breast cancer surgery: a prospective cohort study. Cancers 12, 3817 (2020).
Kim, J.-Y. et al. Deep learning-based prediction model for breast cancer recurrence using adjuvant breast cancer cohort in tertiary cancer center registry. Front. Oncol. 11, 596364 (2021).
Montazeri, M., Montazeri, M., Montazeri, M. & Beigzadeh, A. Machine learning models in breast cancer survival prediction. Technol. Health Care 24, 31–42 (2016).
Chen, H., Gao, M., Zhang, Y., Liang, W. & Zou, X. Attention-based multi-NMF deep neural network with multimodality data for breast cancer prognosis model. BioMed. Res. Int. 2019, 1–11 (2019).
Perou, C. M. et al. Molecular portraits of human breast tumours. Nature 406, 747–752 (2000).
Bhattacharya, A. et al. An approach for normalization and quality control for NanoString RNA expression data. Brief. Bioinform. 22, bbaa163 (2021).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90 (2017).
Hossain, M. Z., Sohel, F., Shiratuddin, M. F. & Laga, H. A comprehensive survey of deep learning for image captioning. ACM Computing Surveys 51, 1–36 (2019).
Deng, J. et al. ImageNet: a large-scale hierarchical image database. in 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009). https://doi.org/10.1109/CVPR.2009.5206848.
Talo, M. Convolutional neural networks for multi-class histopathology image classification. Preprint at https://arxiv.org/abs/1903.10035 (2019).
Acknowledgements
We are grateful to CBCS participants, without whose selfless participation this research would not be possible. We are also grateful to the CBCS staff for all the work they do in abstracting and maintaining the study data. This research was supported by a grant from UNC Lineberger Comprehensive Cancer Center, which is funded by the University Cancer Research Fund of North Carolina, the Susan G Komen Foundation (OGUNC1202), the National Cancer Institute of the National Institutes of Health (P01CA151135), and the National Cancer Institute Specialized Program of Research Excellence (SPORE) in Breast Cancer (NIH/NCI P50-CA058223). This research recruited participants and/or obtained data with the assistance of Rapid Case Ascertainment, a collaboration between the North Carolina Central Cancer Registry and UNC Lineberger. RCA is supported by a grant from the National Cancer Institute of the National Institutes of Health (P30CA016086). The authors would like to acknowledge the University of North Carolina BioSpecimen Processing Facility for sample processing, storage, and sample disbursements (http://bsp.web.unc.edu/). This work was further supported by U54 CA156733, as well as Susan G. Komen grants TREND21686258 and SAC210102. Research of J.S.M. is partially supported by NSF Grant DMS-2113404.
Author information
Authors and Affiliations
Contributions
M.A.T., M.N. and K.A.H. contributed to the development of the concept for the study and provided supervision and feedback throughout the study process. J.G., B.C.C. and J.S.M. provided technical and methodological support. Y.S. and L.T.O. collected and analyzed data, then drafted the manuscript. Y.S. and L.T.O. are co-first authors for this manuscript. All authors contributed to reviewing and editing the manuscript.
Corresponding author
Ethics declarations
Competing interests
The University of North Carolina, Chapel Hill has a license of intellectual property interest in GeneCentric Diagnostics and BioClassifier, LLC, which may be used in this study. The University of North Carolina, Chapel Hill may benefit from this interest that is/are related to this research. The terms of this arrangement have been reviewed and approved by the University of North Carolina, Chapel Hill Conflict of Interest Program in accordance with its conflict-of-interest policies. B.C.C. was a member of the Oncology Advisory Board for Luminex Corporation (9/2019-7/2022).
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Shi, Y., Olsson, L.T., Hoadley, K.A. et al. Predicting early breast cancer recurrence from histopathological images in the Carolina Breast Cancer Study. npj Breast Cancer 9, 92 (2023). https://doi.org/10.1038/s41523-023-00597-0