Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 6, 2024
Date Accepted: Dec 3, 2024

The final, peer-reviewed published version of this preprint can be found here:

Development and Validation of a Machine Learning Algorithm for Predicting Diabetes Retinopathy in Patients With Type 2 Diabetes: Algorithm Development Study

Kim S, Park J, Son Y, Lee H, Woo S, Lee M, Lee H, Sang H, Yon DK, Rhee SY

JMIR Med Inform 2025;13:e58107

DOI: 10.2196/58107

PMID: 39924304

PMCID: 11830482

A machine learning algorithm for predicting diabetes retinopathy in patients with type 2 diabetes: Derivation and validation in two independent cohorts in South Korea

  • Sunyoung Kim
  • Jaeyu Park
  • Yejun Son
  • Hojae Lee
  • Selin Woo
  • Myeongcheol Lee
  • Hayeon Lee
  • Hyunji Sang
  • Dong Keon Yon
  • Sang Youl Rhee

ABSTRACT

Background:

Diabetic retinopathy (DR) is the leading cause of preventable blindness worldwide. Machine learning (ML) systems show potential to enhance DR screening in community-based settings. However, studies assessing the usability and predictive performance of such models are scarce.

Objective:

This study used data from three university hospitals in Korea to develop and assess a simple and accurate ML-based risk prediction model for DR development that can be applied broadly to adults with type 2 diabetes mellitus (T2DM).

Methods:

This study predicted DR using data from two independent electronic medical record-based cohorts: a discovery cohort (one hospital, n=68,009) and a validation cohort (two hospitals, n=18,895). The primary outcome was the presence or absence of DR at three years. Different ML-based models were selected through hyperparameter tuning in the discovery cohort, and their area under the receiver operating characteristic curve was analyzed in the validation cohort.
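For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of how an area under the receiver operating characteristic curve (AUROC) can be computed from predicted risk scores and observed outcomes. It uses the rank-sum (Mann-Whitney U) identity; the function name `auroc` and the toy inputs are illustrative assumptions and do not come from the study.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    labels: 0/1 outcomes (1 = DR present); scores: predicted risks.
    Hypothetical helper for illustration, not the authors' code.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Toy example with made-up values (not study data).
example_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The value equals the probability that a randomly chosen patient with DR receives a higher score than a randomly chosen patient without DR, with ties counted as one half.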

Results:

Among 68,009 patients screened for inclusion, 14,694 (21.61%) were eligible for study analysis, and 348 (2.37%) patients were referred for DR. For DR, the XGBoost system had an accuracy of 73.10% (95% confidence interval [CI], 71.27–74.93), a sensitivity of 72.71% (95% CI, 71.03–74.39), and a specificity of 73.11% (95% CI, 71.27–74.94) in the original dataset. In the validation dataset, XGBoost had an accuracy of 66.86%, a sensitivity of 67.15%, and a specificity of 66.84%. The most common feature in the XGBoost model was dyslipidemia, followed by cancer, hypertension, chronic kidney disease, neuropathy, and cardiovascular disease.
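As a rough illustration of how the reported measures relate to a confusion matrix, the sketch below derives accuracy, sensitivity, and specificity with a normal-approximation (Wald) 95% CI. The abstract does not state how the intervals were computed, so the Wald interval is an assumption, and the function names and confusion-matrix counts are hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Proportion with a Wald (normal-approximation) interval.

    Assumption: the source does not say which CI method was used.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def screening_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity as (estimate, lower, upper)."""
    return {
        "accuracy": proportion_ci(tp + tn, tp + fp + tn + fn),
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = screening_metrics(tp=80, fp=200, tn=700, fn=20)

# The cohort percentages quoted in the text follow from the stated counts.
eligible_pct = round(100 * 14694 / 68009, 2)
referred_pct = round(100 * 348 / 14694, 2)
```

With few positive cases, a Wilson or exact interval is often preferred over the Wald interval for sensitivity.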





© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.