Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Nov 29, 2025
Date Accepted: Mar 12, 2026

The final, peer-reviewed published version of this preprint can be found here:

Pilgram L, Yang K, Beltran-Bless AA, Pond G, Vandermeer L, Hilton J, Savard MF, LeBlanc A, Sheperd L, Chen B, Bartlett J, Taylor K, Bayani J, Barker S, Spears M, van de Velde C, Kranenbarg E, Dirix L, Mallon E, Hasenburg A, Markopoulos C, Juwara L, Dankar F, Clemons M, El Emam K

Transfer Learning and Machine Learning for Training Five-Year Survival Prognostic Models in Early Breast Cancer: Development and Validation Study

J Med Internet Res 2026;28:e88665

DOI: 10.2196/88665

PMID: 41980703

Transfer Learning and Machine Learning for Training Five-Year Survival Prognostic Models in Early Breast Cancer: Development and Validation Study

  • Lisa Pilgram
  • Kai Yang
  • Ana-Alicia Beltran-Bless
  • Gregory Pond
  • Lisa Vandermeer
  • John Hilton
  • Marie-France Savard
  • Andreanne LeBlanc
  • Lois Sheperd
  • Bingshu Chen
  • John Bartlett
  • Karen Taylor
  • Jane Bayani
  • Sarah Barker
  • Melanie Spears
  • Cornelis van de Velde
  • Elma Kranenbarg
  • Luc Dirix
  • Elizabeth Mallon
  • Annette Hasenburg
  • Christos Markopoulos
  • Lamin Juwara
  • Fida Dankar
  • Mark Clemons
  • Khaled El Emam

ABSTRACT

Background:

Prognostic information is essential for decision-making in breast cancer management. In recent years, trials and clinical practice have predominantly focused on genomic prognostication tools, even though clinicopathological prognostication is less costly and more widely accessible. PREDICT v3 is an example of a clinicopathological prognostication tool that has shown promising results across multiple cohorts. Advances in machine learning (ML), transfer learning and ensemble integrations now offer opportunities to strengthen such approaches, particularly in contexts where missingness and model assumptions vary across cohorts.

Objective:

This study evaluates the potential to improve survival prognostication in breast cancer. Specifically, we compare de-novo ML, transfer learning from the pre-trained prognostication model PREDICT v3, and a stacked ensemble approach.

Methods:

Data from the MA.27 trial (NCT00066573) were used for model training, with external validation on data from the TEAM trial (NCT00279448, NCT00032136) and a SEER cohort. Transfer learning was applied by fine-tuning (re-estimating the parameters of) the pre-trained prognostic tool PREDICT v3. De-novo ML included Random Survival Forests (RSF) and Extreme Gradient Boosting (XGB), and the ensemble was implemented using weighted linear stacking, that is, a weighted sum of model predictions. Internal and external validation assessed calibration (Integrated Calibration Index, ICI) and discrimination (area under the receiver operating characteristic curve, AUROC). Shapley Additive Explanations (SHAP) were used to explain model predictions, and decision curve analysis (DCA) was used to facilitate interpretation of performance differences.
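The Methods describe the ensemble as a weighted sum of model predictions. A minimal sketch of that idea is given below, assuming each base model outputs a predicted five-year risk per patient. The function name `stack_predictions` and the example weights are hypothetical, and the abstract does not state how the weights were estimated, so they are simply passed in here.

```python
import numpy as np


def stack_predictions(preds, weights):
    """Combine base-model risk predictions by a normalized weighted sum.

    preds:   shape (n_models, n_patients), predicted risks in [0, 1]
    weights: shape (n_models,), non-negative (hypothetical values; how the
             study estimated its weights is not stated in the abstract)
    """
    preds = np.asarray(preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if preds.ndim != 2 or weights.shape != (preds.shape[0],):
        raise ValueError("need one weight per row of preds")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    # Normalize so the stacked output stays on the probability scale.
    weights = weights / weights.sum()
    return weights @ preds


# Two illustrative base models, two patients.
stacked = stack_predictions([[0.10, 0.20], [0.30, 0.40]], weights=[1.0, 3.0])
```

Normalizing the weights keeps the combined prediction between the smallest and largest base-model prediction for each patient, which is one reason a linear stack is easy to interpret.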

Results:

Transfer learning, de-novo RSF, and the stacked ensemble integration improved calibration in MA.27 over the pre-trained model (ICI reduced from 0.042 in PREDICT v3 to ≤0.007) while discrimination remained comparable (AUROC increased from 0.738 in PREDICT v3 to 0.744-0.799). In DCA, these approaches demonstrated consistently positive net benefit across clinically relevant thresholds, while PREDICT v3 lost net benefit beyond 7.5% predicted risk. Invalid PREDICT v3 predictions were observed in 23.8-25.8% of MA.27 individuals due to missing information. In contrast, ML models and the stacked ensemble integration could predict survival regardless of missing information. Across all models, patient age, nodal status, pathological grading, and tumor size consistently had the highest SHAP values, indicating their importance for survival prognostication. External validation in SEER confirmed the benefits of transfer learning, RSF, and ensemble integration in terms of calibration while maintaining discrimination at comparable levels. In contrast, generalizability was limited in TEAM, a cohort with a substantially different distribution of clinicopathological characteristics.
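For readers unfamiliar with decision curve analysis, the net benefit reported above can be illustrated with the standard formula NB(p_t) = TP/n − (FP/n) · p_t/(1 − p_t), where p_t is the risk threshold. The sketch below assumes a fully observed binary outcome (death within five years) and ignores censoring, which a survival study would need to handle; the function name and toy data are hypothetical and not taken from the paper.

```python
import numpy as np


def net_benefit(y_true, risk, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Assumes a binary, fully observed outcome (no censoring), which is a
    simplification relative to a survival setting.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    y_true = np.asarray(y_true).astype(bool)
    risk = np.asarray(risk, dtype=float)
    flagged = risk >= threshold
    n = y_true.size
    true_pos = np.sum(flagged & y_true)
    false_pos = np.sum(flagged & ~y_true)
    # False positives are penalized by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Toy data: four patients, evaluated at two thresholds.
y = [1, 0, 0, 1]
p = [0.90, 0.20, 0.60, 0.05]
nb_low = net_benefit(y, p, threshold=0.10)
nb_mid = net_benefit(y, p, threshold=0.50)
```

A decision curve plots this quantity over a range of thresholds and compares it with the "treat all" and "treat none" strategies; a model "loses net benefit" at thresholds where its curve falls to or below those references.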

Conclusions:

This study demonstrates that transfer learning, de-novo RSF, and a stacked ensemble integration can improve prognostication compared with the pre-trained PREDICT v3, particularly in the presence of missing or uncertain inputs. Transportability may be limited in cohorts with a different clinicopathological profile, highlighting the need for local validation prior to clinical deployment, especially in situations where relevant information for PREDICT v3 is lacking or where a dataset shift is likely. Ultimately, better survival estimation can provide meaningful guidance in breast cancer treatment, supporting a more targeted, cost-effective, and personalized approach to breast cancer care.

Clinical Trial: NCT00066573; NCT00279448; NCT00032136



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.