Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Nov 18, 2022
Date Accepted: Sep 29, 2023
Prediction of Survival for Colorectal Cancer Using Time-to-Event Machine Learning: Retrospective Cohort Study
ABSTRACT
Background:
Machine learning (ML) methods have shown great potential in colorectal cancer (CRC) survival prediction. However, most ML models proposed so far treat the outcome as a binary class rather than as a dynamic variable whose probability changes over time.
Objective:
This study aims to evaluate the performance of ML approaches for modeling time-to-event survival data and develop an explainable model for predicting CRC-specific survival.
Methods:
A retrospective cohort of 2,157 patients with CRC was collected from the Colorectal Cancer Database of West China Hospital, Sichuan University. We assessed the performance of six ML models, namely random survival forests (RSF), gradient boosting machines (GBM), DeepSurv, DeepHit, Cox-Time, and neural multi-task logistic regression (N-MTLR), in predicting CRC-specific survival, using the time-dependent concordance index (Ctd) and the integrated Brier score (IBS). Multivariable analysis and clinical experience were used to select significant features associated with CRC survival. Finally, the SHapley Additive exPlanations (SHAP) method was applied to explain how the best-performing model predicted 5-year CRC-specific survival.
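As a rough illustration of what a concordance-based discrimination metric measures, the sketch below computes Harrell's concordance index for right-censored data. This is a simpler relative of the time-dependent Ctd used in the study, not the study's own implementation; the function name `concordance_index` and the toy follow-up data are hypothetical and chosen only for the example.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up time for each patient
    events:      1 if the event was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only when the shorter time ends in an event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event has the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (hypothetical): four patients, the second one censored.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]

perfect = concordance_index(times, events, [0.9, 0.4, 0.6, 0.1])
partial = concordance_index(times, events, [0.9, 0.4, 0.1, 0.6])
print(perfect, partial)
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking. The Ctd reported in this study extends the same idea by comparing predicted survival probabilities at the time of each observed event rather than a single fixed risk score.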
Results:
All the time-to-event ML models outperformed the traditional Cox proportional hazards model in both discrimination and calibration, and the DeepSurv model demonstrated the best discriminative ability (Ctd 0.810, 95% CI: 0.798, 0.828) and calibration ability (IBS 0.084, 95% CI: 0.078, 0.088). The SHAP method revealed that R0 resection, TNM staging, PLN, and age were important factors for 5-year CRC-specific survival.
Conclusions:
The time-to-event ML models accurately predict CRC-specific survival, with DeepSurv demonstrating the best discriminative and calibration ability. Combining time-to-event ML with SHAP can explain CRC-specific survival predictions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.