Accepted for/Published in: JMIR AI
Date Submitted: Oct 3, 2025
Open Peer Review Period: Oct 3, 2025 - Nov 28, 2025
Date Accepted: Apr 28, 2026
Prediction of Type 2 Diabetes Mellitus from chest x-rays using a suite of previously developed chronic disease deep learning models in an ethnically diverse cohort
ABSTRACT
Background:
Screening for Type 2 Diabetes (T2D) is suboptimal, leaving many patients undiagnosed. Deep learning (DL) applied to chest radiographs (CXR) has shown promise for opportunistic T2D prediction, with prior work in a predominantly suburban white cohort achieving an AUC of 0.84. We evaluated performance in a diverse, urban population with higher minority representation, greater social deprivation, and higher T2D prevalence.
Objective:
To evaluate the performance and generalizability of a chest radiograph–based deep learning model for predicting Type 2 Diabetes prevalence and incidence in a diverse, urban population with high minority representation and social deprivation.
Methods:
We analyzed adults with ambulatory CXRs (2010–2020) from a tertiary academic medical center in Chicago. Demographics, body mass index (BMI), hemoglobin A1c, diabetes medications, and residential ZIP code were collected. T2D prevalence was modeled using XGBoost, and incidence using Cox proportional hazards regression. Model performance was compared using AUC, and predictors were assessed using feature importance and odds ratios.
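As an illustration of the AUC-based comparison described above, the following is a minimal sketch rather than the authors' pipeline. The abstract does not state how AUCs or their confidence intervals were computed, so the rank-based AUC and the paired percentile bootstrap used here are assumptions, and the function names `auc` and `bootstrap_auc_diff` are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Tied scores share an average rank, so a positive/negative tie counts as 0.5.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_auc_diff(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """95% percentile interval for AUC(a) - AUC(b), resampling patients in pairs.

    Assumption: a paired bootstrap stands in for whatever test the study used.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[k] for k in idx]
        if sum(y) in (0, n):  # skip resamples that contain a single class
            continue
        a = auc(y, [scores_a[k] for k in idx])
        b = auc(y, [scores_b[k] for k in idx])
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
```

Under these assumptions, an interval for the AUC difference that excludes zero would indicate that one model (for example, one with DL features) discriminates better than the other on the same patients.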
Results:
Among 39,811 patients (53.5% Black, 23% Latino, 14% White), 25.4% had T2D at first CXR. The DL-based model with demographics and BMI achieved an AUC of 0.807 [0.797-0.817], significantly outperforming the model without DL (AUC 0.759, P < 0.0001). T2D prevalence AUC was similar across racial groups (Latino 0.818, White 0.819, Black 0.790). The Social Deprivation Index (SDI) had no association with prevalence and minimal impact on incidence. Discrimination for 3-year incidence prediction was 0.71 [0.68-0.74]. The highest quartile of DL-predicted risk had a 7-fold higher incidence than the lowest.
Conclusions:
CXR-based DL enables opportunistic T2D screening in diverse, urban populations, with strong prevalence prediction across racial groups. Incidence prediction was less accurate than in the derivation cohort (0.71 vs. 0.79), highlighting the need to refine models for longitudinal risk.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.