JMIR Preprints #72938: Gestational Diabetes Diagnoses in Electronic Health Records: A Three-Step Study of Label Accuracy and Its Impact on Machine Learning Models for Early Prediction

Gestational Diabetes Diagnoses in Electronic Health Records: A Three-Step Study of Label Accuracy and Its Impact on Machine Learning Models for Early Prediction

Mark Germaine;
Amy C O'Higgins;
Brendan Egan;
Graham Healy

ABSTRACT

Background:

Integration of electronic health records (EHRs) into clinical research offers numerous opportunities for advancing healthcare delivery and patient outcomes, particularly in the era of machine learning (ML). However, EHR data needs to be coded accurately to ensure that models are learning correct representations of diseases.

Objective:

This study examines the accuracy of gestational diabetes mellitus (GDM) diagnoses in EHRs compared with a clinical team database (CTD) and their impact on ML models.

Methods:

EHRs from 2018-2022 were validated against CTD data to identify true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Logistic regression (LR) models were trained and tested using both EHR and validated labels, after which simulated label noise was introduced to increase FP and FN rates. Model performance was assessed using the area under the receiver operating characteristic curve (ROC-AUC) and average precision (AP).
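The label validation and noise simulation described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of the validated labels as the reference standard, and the independent per-label flipping scheme are all assumptions made for illustration.

```python
import random


def confusion_counts(ehr_labels, validated_labels):
    """Count TP/FP/TN/FN for EHR labels, using validated labels as the reference.

    Both inputs are equal-length sequences of 0/1 values (1 = GDM).
    The name and signature are hypothetical, not taken from the study.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for ehr, truth in zip(ehr_labels, validated_labels):
        if ehr and truth:
            counts["TP"] += 1
        elif ehr and not truth:
            counts["FP"] += 1  # EHR says GDM, reference says no GDM
        elif not ehr and truth:
            counts["FN"] += 1  # EHR misses a reference GDM case
        else:
            counts["TN"] += 1
    return counts


def inject_label_noise(labels, fp_rate=0.0, fn_rate=0.0, seed=0):
    """Return a copy of `labels` with simulated label noise.

    Assumed scheme: each negative flips to positive with probability
    `fp_rate`, and each positive flips to negative with probability
    `fn_rate`, independently of the others.
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if y == 1 and rng.random() < fn_rate:
            noisy.append(0)  # simulated false negative
        elif y == 0 and rng.random() < fp_rate:
            noisy.append(1)  # simulated false positive
        else:
            noisy.append(y)
    return noisy
```

Applying `inject_label_noise` to the training labels only, or to the test labels only, would reproduce the two conditions the abstract contrasts, with a model retrained or re-evaluated at each noise level.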

Results:

Among 3,952 patients, 3,388 (85.7%) were correctly identified with GDM in both databases, while 564 cases lacked a GDM label in EHRs and 771 were missing a corresponding CTD label. Overall, 87.5% of cases were TN, 9.0% TP, 2.0% FP, and 1.5% FN. The model trained and tested with validated labels achieved a ROC-AUC of 0.817 and an AP of 0.450, whereas the same model tested using EHR labels achieved 0.814 and 0.395, respectively. Increased label noise during training led to gradual declines in ROC-AUC and AP, while noise in the test set, especially elevated FP rates, resulted in marked performance drops.
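As a rough consistency check, the reported percentages can be turned into standard agreement metrics for the EHR labels. The sketch below is an illustration only: the derived sensitivity, positive predictive value, and specificity are not reported in the abstract, and because the inputs are rounded percentages the results are approximate. The function name is hypothetical.

```python
def label_agreement_metrics(tp, fp, tn, fn):
    """Derive sensitivity, PPV, and specificity from confusion proportions.

    Inputs may be counts or percentages, as long as all four share one scale.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of reference GDM cases labelled in the EHR
        "ppv": tp / (tp + fp),          # share of EHR GDM labels confirmed by the reference
        "specificity": tn / (tn + fp),  # share of reference non-GDM cases left unlabelled
    }


# Rounded percentages from the Results: 9.0% TP, 2.0% FP, 87.5% TN, 1.5% FN.
metrics = label_agreement_metrics(tp=9.0, fp=2.0, tn=87.5, fn=1.5)
print({name: round(value, 3) for name, value in metrics.items()})
# -> {'sensitivity': 0.857, 'ppv': 0.818, 'specificity': 0.978}

# The count-based figure from the same paragraph gives the same proportion.
print(round(3388 / 3952, 3))  # -> 0.857
```

The sensitivity derived from the percentages (about 85.7%) matches the count-based proportion of 3,388 out of 3,952, which suggests the two sets of figures in the Results describe the same comparison.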

Conclusions:

Discrepancies between EHR and CTD diagnoses had limited impact on model training but significantly affected performance evaluation when present in the test set, emphasising the importance of accurate data validation.

Citation

Please cite as:

Germaine M, O'Higgins AC, Egan B, Healy G

Label Accuracy in Electronic Health Records and Its Impact on Machine Learning Models for Early Prediction of Gestational Diabetes: 3-Step Retrospective Validation Study

JMIR Med Inform 2025;13:e72938

DOI: 10.2196/72938

PMID: 40854223

PMCID: 12377786

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Feb 21, 2025

Open Peer Review Period: Mar 3, 2025 - Apr 28, 2025

Date Accepted: Jun 17, 2025
