JMIR Preprints #87819: Applications of AutoML in Diabetes Risk Prediction: A Rapid Review of Methodological Approaches and Reported Performance (2015–2025)

Applications of AutoML in Diabetes Risk Prediction: A Rapid Review of Methodological Approaches and Reported Performance (2015–2025)

Alexandre Castonguay;
Sandrine Hegg-Deloye;
Arthur Chatton;
Amélie Goyette

ABSTRACT

Background:

Type 2 diabetes (T2D) is a complex chronic condition that imposes a substantial burden on healthcare systems. Prevention and early detection are critical to mitigating its impact. Automated machine learning (AutoML) models have the potential to predict individual risk and guide personalized interventions. However, their clinical deployment remains limited by the retrospective nature of most datasets, a lack of external validation, and heterogeneity in variable selection.

Objective:

To map AutoML approaches applied to T2D risk prediction, with a specific focus on their ability to integrate clinical, behavioral, environmental, and genomic data.

Methods:

A PRISMA-guided rapid review was conducted across six databases (PubMed, Scopus, Web of Science, IEEE Xplore, Google Scholar, and EMBASE) to identify empirical studies (2015–2025) that used AutoML tools for T2D prediction based on at least two data types (e.g., clinical, behavioral, environmental, genomic). Screening, data extraction, and synthesis were performed systematically by two independent reviewers, with disagreements arbitrated by a third reviewer, an AI system (ChatGPT).

Results:

Thirteen studies met the inclusion criteria. Methodological approaches ranged from conventional machine learning with manual feature selection to partially or fully automated pipelines using tools such as TPOT, H2O AutoML, or Azure ML. Reported performance varied (AUC 0.75–0.99), but external validation was uncommon. Behavioral and environmental data were only partially integrated, and no study incorporated genomic data despite its recognized potential. Most studies lacked transparency and reproducibility, with no public code or pipeline sharing.

Conclusions:

AutoML holds significant promise for improving T2D risk prediction through automation and model explainability. Yet, to support clinical adoption and generalizability, future AutoML pipelines must be developed using prospective, multicenter datasets; integrate diverse and harmonized data types, including genomics; and adhere to open science principles of transparency, reproducibility, and interpretability.

Citation

Please cite as:

Castonguay A, Hegg-Deloye S, Chatton A, Goyette A

Methodological Approaches to and Reported Performance of Applications of Automated Machine Learning in Diabetes Risk Prediction: Rapid Review

JMIR AI 2026;5:e87819

DOI: 10.2196/87819

PMID: 42119064

PMCID: 13167060

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Nov 19, 2025

Open Peer Review Period: Dec 3, 2025 - Jan 28, 2026

Date Accepted: Mar 4, 2026
