JMIR Preprints #58085: Addressing Information Biases within Electronic Health Record Data to Improve Examination of Epidemiologic Associations with Diabetes Prevalence among Young Adults: Cross-Sectional Study

Addressing Information Biases within Electronic Health Record Data to Improve Examination of Epidemiologic Associations with Diabetes Prevalence among Young Adults: Cross-Sectional Study

Sarah Conderino;
Rebecca Anthopolos;
Sandra S. Albrecht;
Shannon M. Farley;
Jasmin Divers;
Andrea R. Titus;
Lorna E. Thorpe

ABSTRACT

Background:

Electronic health records (EHRs) are increasingly used for epidemiologic research to advance public health practice. However, key variables within EHRs, including demographic information and disease status, are susceptible to missing data or misclassification, which could affect estimation of disease prevalence or risk factor associations.

Objective:

In this article, we applied methods from the literature on missing data and causal inference to assess whether we could mitigate information biases when estimating measures of association between potential risk factors and diabetes among a patient population of New York City (NYC) young adults.

Methods:

We estimated odds ratios (ORs) for diabetes by race/ethnicity and asthma status using EHR data from NYU Langone Health. Methods from the missing data and causal inference literature were then applied to assess the ability to control for misclassification of health outcomes in the EHR data. We compared EHR-based associations with associations observed from two national health surveys, the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health and Nutrition Examination Survey (NHANES), representing traditional public health surveillance systems.

Results:

Observed EHR-based associations between race/ethnicity and diabetes were comparable to health survey-based estimates, but the association between asthma and diabetes was significantly overestimated (EHR-based OR=3.01 vs BRFSS-based OR=1.23). Missing data and causal inference methods reduced information biases in these estimates, yielding relative differences from traditional estimates below 50% (missing data OR=1.79, causal inference OR=1.42).
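
As a rough check on the "below 50%" statement, the short sketch below recomputes relative differences from the odds ratios quoted above. The abstract does not state the formula used, so this assumes relative difference = (OR estimate - OR reference) / OR reference, with the BRFSS estimate as the reference; the names `relative_difference`, `estimates`, and `rel_diff` are illustrative and not taken from the paper.

```python
# Odds ratios for the asthma-diabetes association, as quoted in the abstract.
or_brfss = 1.23  # survey-based reference (BRFSS)
estimates = {
    "EHR (unadjusted)": 3.01,
    "Missing data": 1.79,
    "Causal inference": 1.42,
}


def relative_difference(estimate, reference):
    """Assumed definition: (estimate - reference) / reference, in percent."""
    return 100.0 * (estimate - reference) / reference


# Relative difference of each EHR-based estimate from the BRFSS estimate.
rel_diff = {
    name: relative_difference(value, or_brfss)
    for name, value in estimates.items()
}

for name, pct in rel_diff.items():
    print(f"{name}: OR={estimates[name]:.2f}, relative difference={pct:.1f}%")
```

Under this assumed definition, the unadjusted EHR estimate differs from the BRFSS estimate by well over 100%, while both adjusted estimates fall under the 50% threshold mentioned in the Results.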

Conclusions:

Findings suggest that without bias adjustment, EHR analyses may yield biased measures of association, driven in part by subgroup differences in healthcare utilization patterns. However, applying missing data or causal inference frameworks can help control for and, importantly, characterize residual information biases in these estimates.

Citation

Please cite as:

Conderino S, Anthopolos R, Albrecht SS, Farley SM, Divers J, Titus AR, Thorpe LE

Addressing Information Biases Within Electronic Health Record Data to Improve the Examination of Epidemiologic Associations With Diabetes Prevalence Among Young Adults: Cross-Sectional Study

JMIR Med Inform 2024;12:e58085

DOI: 10.2196/58085

PMID: 39353204

PMCID: 11460830

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 5, 2024

Date Accepted: Jul 10, 2024
