JMIR Preprints #19069: A Personalized Prognostic Model for Early Invasive Breast Cancer by Machine-Learning Multidimensional Data: A Population-based Cohort Study in China


A Personalized Prognostic Model for Early Invasive Breast Cancer by Machine-Learning Multidimensional Data: A Population-based Cohort Study in China

Xiaorong Zhong;
Ting Luo;
Ling Deng;
Pei Liu;
Kejia Hu;
Donghao Lu;
Dan Zheng;
Chuanxu Luo;
Yuxin Xie;
Jiayuan Li;
Ping He;
Tianjie Pu;
Feng Ye;
Hong Bu;
Bo Fu;
Hong Zheng

ABSTRACT

Background:

Current online prognostic prediction models for breast cancer, Adjuvant! Online and PREDICT, are based mainly on specific populations. They have been well validated and widely used in the United States and Western Europe. However, several validation attempts in non-European countries revealed suboptimal predictions.

Objective:

We aimed to develop an advanced breast cancer prognosis model for disease progression, cancer-specific mortality, and all-cause mortality by integrating tumor, demographic, and treatment characteristics based on a large breast cancer cohort in China.

Methods:

This study was approved by the Clinical Test and Biomedical Ethics Committee of West China Hospital, Sichuan University, on May 17, 2012. Data collection for this project began in May 2017 and ended in March 2019. Data on 5,293 women diagnosed with stage I–III invasive breast cancer between 2000 and 2013 were collected. The endpoints were disease progression, cancer-specific mortality, and all-cause mortality, and the likelihood of disease progression or death within a 5-year period was predicted. The machine learning method XGBoost was used to develop the prediction model. Model performance was assessed by calculating the area under the curve (AUC), followed by calibration and comparison with PREDICT.
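The two evaluation steps named above, discrimination (AUC) and calibration, can be illustrated with a minimal sketch. This is not the authors' code and it does not train an XGBoost model; it only shows, under stated assumptions, how predicted 5-year risks might be scored against observed outcomes. The function names `auc` and `calibration_table`, the grouping into equal-sized risk bins, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen event case receives
    a higher predicted risk than a randomly chosen non-event case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_table(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Sort cases by predicted risk, split them into roughly equal-sized
    groups, and return (mean predicted risk, observed event rate, group size)
    for each group. Good calibration means the first two values are close."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer cases than bins
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows


# Toy data (hypothetical, not from the study): 1 = event within 5 years.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.05, 0.10, 0.35, 0.40, 0.60, 0.70, 0.20, 0.90]

print("AUC:", auc(y_true, y_score))
for mean_pred, observed, size in calibration_table(y_true, y_score, n_bins=2):
    print(f"predicted {mean_pred:.3f}  observed {observed:.3f}  n={size}")
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for illustration; a rank-based formula or a library routine would be the usual choice for a cohort of several thousand patients.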

Results:

The training, test, and validation populations comprised 3,276 (499 progressions, 202 breast cancer-specific deaths, and 261 all-cause deaths within the 5-year follow-up), 1,405 (211 progressions, 94 breast cancer-specific deaths, and 129 all-cause deaths), and 612 (109 progressions, 33 breast cancer-specific deaths, and 37 all-cause deaths) women, respectively. The AUCs for disease progression, cancer-specific mortality, and all-cause mortality were 0.76, 0.88, and 0.82 in the training population; 0.79, 0.80, and 0.83 in the test population; and 0.79, 0.84, and 0.88 in the validation population, respectively. Calibration analysis demonstrated good agreement between the predicted and observed events within 5 years. Comparable AUCs and calibrations were confirmed in subgroups of different ages, residence statuses, and receptor statuses. Compared with PREDICT, our model showed similar AUCs and improved calibrations.
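As a quick arithmetic check on the reported counts, the sketch below sums the three populations and derives the crude proportion of women with each event in each population. The counts are copied from the paragraph above; the dictionary layout and the helper name `event_rates` are assumptions made for illustration only.

```python
# Counts as reported in the Results paragraph.
cohorts = {
    "training": {"n": 3276, "progression": 499, "cancer_death": 202, "all_cause_death": 261},
    "test": {"n": 1405, "progression": 211, "cancer_death": 94, "all_cause_death": 129},
    "validation": {"n": 612, "progression": 109, "cancer_death": 33, "all_cause_death": 37},
}


def event_rates(cohort: dict) -> dict:
    """Crude proportion of women with each event, ignoring censoring."""
    n = cohort["n"]
    return {name: count / n for name, count in cohort.items() if name != "n"}


total = sum(c["n"] for c in cohorts.values())
print("total women:", total)  # 5293, matching the cohort size in Methods

rates = {name: event_rates(c) for name, c in cohorts.items()}
for name, r in rates.items():
    share = cohorts[name]["n"] / total
    print(f"{name}: {share:.1%} of cohort, " +
          ", ".join(f"{event} {rate:.1%}" for event, rate in r.items()))
```

These are simple proportions rather than time-to-event estimates, so they are only a rough guide to how the event burden compares across the three populations.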

Conclusions:

Our integrative prognostic model exhibits high discrimination and good calibration. It may facilitate prognosis prediction and clinical decision making for Chinese breast cancer patients.

Citation

Please cite as:

Zhong X, Luo T, Deng L, Liu P, Hu K, Lu D, Zheng D, Luo C, Xie Y, Li J, He P, Pu T, Ye F, Bu H, Fu B, Zheng H

Multidimensional Machine Learning Personalized Prognostic Model in an Early Invasive Breast Cancer Population-Based Cohort in China: Algorithm Validation Study

JMIR Med Inform 2020;8(11):e19069

DOI: 10.2196/19069

PMID: 33164899

PMCID: 7683252


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.


Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Apr 5, 2020

Date Accepted: Sep 16, 2020
