Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Nov 12, 2023
Date Accepted: May 29, 2024
Identifying Potential Factors Associated with Racial Disparities in COVID-19 Outcomes: a Machine Learning Analysis of Real-World Data.
ABSTRACT
Background:
Racial disparities in COVID-19 incidence and outcomes have been widely reported. Non-Hispanic Blacks (NHB) suffered disproportionately compared to non-Hispanic Whites (NHW), but the epidemiological basis for these observations was complex and multifaceted.
Objective:
We seek to identify the factors behind the worse COVID-19 outcomes experienced by NHB compared to NHW patients, and how these factors interact, using an explainable machine learning (ML) approach.
Methods:
In this retrospective cohort study, we examined 28,943 laboratory-confirmed COVID-19 cases from the OneFlorida Research Consortium’s data trust of healthcare recipients in Florida through April 28, 2021. We assessed the prevalence of preexisting comorbid conditions, geo-socioeconomic factors, and health outcomes in the structured electronic health records of COVID-19 cases. The primary outcome was a composite of hospitalization, intensive care unit admission, and mortality at index admission. We developed and validated an ML model using XGBoost to evaluate predictors of worse COVID-19 outcomes and rank them by importance.
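As an illustration of the composite primary outcome described above, the following minimal Python sketch flags a case as positive when any of the three component events occurred and computes the outcome rate per group. The record layout, field names, and the "any of the three events" reading of "composite" are assumptions made for illustration; they are not taken from the study's data model or code, and the toy records are invented.

```python
from dataclasses import dataclass


@dataclass
class CovidCase:
    # Hypothetical record layout; field names are illustrative only.
    group: str            # e.g. "NHB" or "NHW"
    hospitalized: bool
    icu_admission: bool
    died: bool


def composite_outcome(case: CovidCase) -> bool:
    """Assumed definition: positive if any component event occurred."""
    return case.hospitalized or case.icu_admission or case.died


def outcome_rate_by_group(cases):
    """Return {group: fraction of cases with the composite outcome}."""
    totals, events = {}, {}
    for case in cases:
        totals[case.group] = totals.get(case.group, 0) + 1
        events[case.group] = events.get(case.group, 0) + int(composite_outcome(case))
    return {group: events[group] / totals[group] for group in totals}


# Toy records, made up purely for illustration (not study data).
toy_cases = [
    CovidCase("NHB", True, False, False),
    CovidCase("NHB", False, False, False),
    CovidCase("NHW", False, False, False),
    CovidCase("NHW", False, True, True),
]
rates = outcome_rate_by_group(toy_cases)
```

A binary label built this way is the kind of target a gradient-boosted classifier such as XGBoost could be trained on; the modeling step itself is not sketched here.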
Results:
Compared to NHW, NHB patients were younger, more likely to be uninsured, had a higher prevalence of emergency department (ED) and inpatient visits, and were in regions with higher area deprivation index rankings and pollutant concentrations. NHB patients had the highest burden of comorbidities and rates of the primary outcome. Age was a key predictor in all models, ranking highest in NHW. However, for NHB, congestive heart failure was a primary predictor. Other variables, such as food environment measures and air pollution indicators, also ranked high. When comorbidities were consolidated into the Elixhauser Comorbidity Index, the index became the top predictor, providing a comprehensive risk measure.
Conclusions:
The study reveals that individual and geo-socioeconomic factors significantly influence COVID-19 outcomes. It also highlights varying risk profiles among different racial groups. Recognizing these relationships is vital for creating effective, tailored interventions that reduce disparities and enhance health outcomes across all racial and socioeconomic groups.
Clinical Trial:
This study was approved by the OneFlorida Institutional Review Board (IRB) at the University of Florida (IRB202001531).
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.