Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 26, 2025
Open Peer Review Period: Feb 26, 2025 - Apr 23, 2025
Date Accepted: Apr 17, 2025

The final, peer-reviewed published version of this preprint can be found here:

Development of a 5-Year Risk Prediction Model for Transition From Prediabetes to Diabetes Using Machine Learning: Retrospective Cohort Study

Zhang Y, Zhang H, Li N, Lv H, Wang D, Zhang G

J Med Internet Res 2025;27:e73190

DOI: 10.2196/73190

PMID: 40344663

PMCID: 12102623

Development of a 5-Year Risk Prediction Model for Transition from Prediabetes to Diabetes Using Machine Learning: Retrospective Cohort Study

  • Yongsheng Zhang
  • Hongyu Zhang
  • Na Li
  • Haoyue Lv
  • Dawei Wang
  • Guang Zhang

ABSTRACT

Background:

Patients with prediabetes can easily progress to diabetes.

Objective:

We aimed to develop a 5-year risk prediction model of progression from prediabetes to diabetes for the Chinese population.

Methods:

A retrospective cohort study was conducted on 2 prediabetes cohorts that were followed from 2019 to 2024. Patients in the primary cohort were randomly split into training (70%) and test (30%) sets. Significant predictors were selected on the training set, and 7 machine learning algorithms (logistic regression, random forest, support vector machine, multilayer perceptron, XGBoost, LightGBM, and CatBoost) were then applied to develop prediction models. Model performance was assessed using receiver operating characteristic (ROC) curves, precision-recall curves, and several other metrics on both the test set and the external test set.
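
The random 70/30 split and the ROC-based evaluation described above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names `train_test_split` and `roc_auc`, the fixed seed, and the rounding of the test-set size are assumptions made for illustration, and the AUC is computed with the pairwise (Mann-Whitney) definition rather than with any particular library.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=0):
    """Randomly split records into (train, test); names and seed are illustrative."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(records)        # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise definition.

    AUC equals the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen negative
    case (label 0); ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In this sketch, a model fitted on the training portion would produce one risk score per patient in the test portion, and `roc_auc(test_labels, test_scores)` would summarize discrimination. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slower than the rank-based implementations found in statistical libraries.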

Results:

The average annual conversion rate from prediabetes to diabetes was 8.34% in the primary cohort and 7.04% in the external cohort. Using 14 features, the CatBoost model performed best, with an area under the ROC curve (AUC) of 0.819 in the test set and 0.807 in the external test set. It also had the highest discrimination across several other metrics and showed outstanding calibration.
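
The abstract does not say which calibration measures were used. As a hedged illustration only, the sketch below shows two common ways to inspect calibration of predicted risks: the Brier score and a binned comparison of mean predicted risk against the observed event rate. The function names `brier_score` and `calibration_bins` and the equal-width binning are assumptions, not the authors' method.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def calibration_bins(labels, probs, n_bins=5):
    """Group predictions into equal-width risk bins (an assumed binning scheme).

    Returns one (mean predicted risk, observed event rate, count) tuple
    per non-empty bin, ordered from lowest to highest risk.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    summary = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            summary.append((mean_pred, observed, len(members)))
    return summary
```

A well-calibrated model gives bins whose mean predicted risk is close to the observed event rate, and a lower Brier score indicates predictions closer to the observed outcomes.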

Conclusions:

We developed a 5-year risk prediction model of progression from prediabetes to diabetes for the Chinese population. The CatBoost model showed the best predictive performance and could effectively identify individuals at high risk of diabetes.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.