Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Nov 12, 2023
Date Accepted: May 29, 2024

The final, peer-reviewed published version of this preprint can be found here:

Identifying Potential Factors Associated With Racial Disparities in COVID-19 Outcomes: Retrospective Cohort Study Using Machine Learning on Real-World Data

Dasa O, Bai C, Sajdeya R, Kimmel SE, Pepine CJ, Gurka M, Laubenbacher R, Pearson TA, Mardini MT

JMIR Public Health Surveill 2024;10:e54421

DOI: 10.2196/54421

PMID: 39326040

PMCID: 11467607

Identifying Potential Factors Associated with Racial Disparities in COVID-19 Outcomes: a Machine Learning Analysis of Real-World Data.

  • Osama Dasa
  • Chen Bai
  • Ruba Sajdeya
  • Stephen E. Kimmel
  • Carl J. Pepine
  • Mathew Gurka
  • Reinhard Laubenbacher
  • Thomas A. Pearson
  • Mamoun T. Mardini

ABSTRACT

Background:

Racial disparities in COVID-19 incidence and outcomes have been widely reported. Non-Hispanic Black (NHB) patients were disproportionately affected compared with non-Hispanic White (NHW) patients, but the epidemiological basis for these observations is complex and multifaceted.

Objective:

We sought to identify the factors behind the worse COVID-19 outcomes experienced by NHB patients compared with NHW patients, and to examine how these factors interact, using an explainable machine learning (ML) approach.

Methods:

In this retrospective cohort study, we examined 28,943 laboratory-confirmed COVID-19 cases from the OneFlorida Research Consortium’s data trust of healthcare recipients in Florida through April 28, 2021. We assessed the prevalence of preexisting comorbid conditions, geo-socioeconomic factors, and health outcomes in the structured electronic health records of COVID-19 cases. The primary outcome was a composite of hospitalization, intensive care unit admission, and mortality at index admission. We developed and validated an ML model using XGBoost to evaluate predictors of worse COVID-19 outcomes and rank them by importance.
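As an illustration of the composite primary outcome described above, the following minimal sketch flags a case as positive if any component event occurred and then computes the outcome rate per group. The "any component" rule is an assumption about how the composite is formed, since the abstract does not specify it. The `Case` record, the field names, the group labels, and the toy records are all hypothetical and are not drawn from the study data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical record for one laboratory-confirmed case."""
    group: str          # placeholder group label, not a real cohort
    hospitalized: bool  # hospitalized at index admission
    icu: bool           # admitted to intensive care
    died: bool          # died at index admission


def composite_outcome(case: Case) -> bool:
    # Assumption: the composite is positive if ANY component occurred.
    return case.hospitalized or case.icu or case.died


def outcome_rate_by_group(cases):
    """Return the fraction of cases with the composite outcome, per group."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for case in cases:
        totals[case.group] += 1
        events[case.group] += int(composite_outcome(case))
    return {group: events[group] / totals[group] for group in totals}


# Toy records invented purely for illustration.
toy_cases = [
    Case("group_1", True, False, False),
    Case("group_1", False, False, False),
    Case("group_2", False, True, True),
    Case("group_2", False, False, False),
    Case("group_2", False, False, False),
    Case("group_2", False, False, False),
]

rates = outcome_rate_by_group(toy_cases)
print(rates)
```

A binary flag of this kind is the sort of target a gradient-boosted classifier such as XGBoost could be trained on; the model fitting and importance ranking themselves are not reproduced here.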

Results:

Compared with NHW patients, NHB patients were younger, more likely to be uninsured, had a higher prevalence of emergency department (ED) and inpatient visits, and lived in regions with higher area deprivation index rankings and pollutant concentrations. NHB patients had the highest burden of comorbidities and the highest rates of the primary outcome. Age was a key predictor in all models, ranking highest in NHW patients. For NHB patients, however, congestive heart failure was a primary predictor. Other variables, such as food environment measures and air pollution indicators, also ranked high. When comorbidities were consolidated into the Elixhauser Comorbidity Index, the index became the top predictor, providing a comprehensive risk measure.

Conclusions:

The study reveals that individual and geo-socioeconomic factors significantly influence COVID-19 outcomes. It also highlights varying risk profiles among different racial groups. Recognizing these relationships is vital for creating effective, tailored interventions that reduce disparities and enhance health outcomes across all racial and socioeconomic groups.

Clinical Trial:

This study was approved by the OneFlorida Institutional Review Board (IRB) at the University of Florida (IRB202001531).


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.