Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 26, 2025
Open Peer Review Period: Feb 26, 2025 - Apr 23, 2025
Date Accepted: Apr 17, 2025
Development of a 5-Year Risk Prediction Model for Transition from Prediabetes to Diabetes Using Machine Learning: Retrospective Cohort Study
ABSTRACT
Background:
Patients with prediabetes can easily progress to diabetes.
Objective:
We aimed to develop a 5-year risk prediction model of progression from prediabetes to diabetes for the Chinese population.
Methods:
A retrospective cohort study was conducted on 2 prediabetes cohorts that were followed from 2019 to 2024. Patients in the primary cohort were randomly split into training (70%) and test (30%) sets. Significant predictors were selected on the training set, and 7 machine learning algorithms (logistic regression, random forest, support vector machine, multilayer perceptron, XGBoost, LightGBM, and CatBoost) were then applied to develop prediction models. Model performance was assessed using receiver operating characteristic (ROC) and precision-recall curves, as well as multiple other metrics, on both the test set and the external test set.
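The random 70/30 split and the ROC-based evaluation described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the seed, and the toy labels and scores are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than with whatever software the study used.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Randomly split records into training and test lists.

    The 0.7 default mirrors the 70%/30% split in the Methods;
    the seed is an arbitrary assumption for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 10 patient identifiers, not study data.
patients = list(range(10))
train, test = split_train_test(patients)

# Hypothetical outcomes (1 = progressed to diabetes) and predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

In this toy example, 3 of the 4 positive-negative pairs are ranked correctly, so `auc` is 0.75. The pairwise loop is quadratic in the number of cases, so for cohorts of realistic size a rank-based implementation from an established library would be the more practical choice.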
Results:
The average annual conversion rate from prediabetes to diabetes was 8.34% in the primary cohort and 7.04% in the external cohort. Using 14 features, the CatBoost model performed best, with an area under the receiver operating characteristic curve (AUC) of 0.819 in the test set and 0.807 in the external test set. It also had the highest discrimination performance across several other metrics and showed outstanding calibration.
Conclusions:
We developed a 5-year risk prediction model of progression from prediabetes to diabetes for the Chinese population. The CatBoost model showed the best predictive performance and could effectively identify individuals at high risk of diabetes.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.