Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Sep 27, 2022
Open Peer Review Period: Sep 27, 2022 - Oct 11, 2022
Date Accepted: Apr 11, 2023
Predictive Modeling of Lapses in Care for People Living with HIV in Chicago: Algorithm Development and Interpretation
ABSTRACT
Background:
Reducing lapses in care for people living with HIV (PLWH) is critical to ending the HIV epidemic and beneficial for the individual health of PLWH. Predictive modeling can identify clinical factors that are associated with lapses in HIV care. Previous studies have identified clinical factors that are associated with lapses in HIV care at the clinic level or national level, but public health strategies to improve retention in care in the U.S. often occur at the city or county level.
Objective:
We sought to build predictive models of lapses in HIV care using a large, multi-site, non-curated database of electronic health records (EHR) in Chicago.
Methods:
We used data from 2011 to 2019 from the Chicago Area Patient-Centered Outcomes Research Network (CAPriCORN), a database that includes 11 health systems containing 12.8 million patients, covering the majority of PLWH in Chicago. CAPriCORN uses a hash-based data deduplication method to follow people across multiple Chicago healthcare systems with different EHRs, providing a unique city-wide view of retention in care for PLWH. From the database, we used diagnosis codes, medications, laboratory tests, demographics, and encounter information to build predictive models. Our primary outcome was lapses in HIV care, which we defined as having more than 12 months between consecutive HIV care encounters. We built logistic regression, random forest, and elastic net logistic regression models using all variables and compared their performance to a baseline logistic regression model containing only demographics and retention history.
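The lapse definition above can be illustrated with a minimal Python sketch. It assumes each patient's HIV care encounters are available as a list of dates and approximates "more than 12 months" as more than 365 days. The function name `label_lapses`, the threshold constant, and the example dates are hypothetical and are not taken from the study.

```python
from datetime import date

# Assumption: "more than 12 months" is approximated as more than 365 days.
LAPSE_THRESHOLD_DAYS = 365


def label_lapses(encounter_dates):
    """Label each encounter (except the last) by whether a lapse followed it.

    Returns a list of (encounter_date, lapsed) pairs, where lapsed is True
    when the gap to the next encounter exceeds the threshold.
    """
    ordered = sorted(encounter_dates)
    labels = []
    # Pair each encounter with the one that follows it chronologically.
    for current, following in zip(ordered, ordered[1:]):
        gap_days = (following - current).days
        labels.append((current, gap_days > LAPSE_THRESHOLD_DAYS))
    return labels


# Hypothetical encounter history for one patient.
visits = [date(2015, 1, 10), date(2015, 7, 2), date(2017, 3, 15), date(2017, 9, 1)]
labels = label_lapses(visits)
```

In this toy history, only the encounter on 2015-07-02 is followed by a gap longer than the threshold, so it is the only one labeled as preceding a lapse. The final encounter has no following visit and is left unlabeled here; how the study handled such censored encounters is not stated in the abstract.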
Results:
We included PLWH with at least two HIV care encounters in the database, resulting in 16,930 PLWH with a total of 191,492 encounters. All models outperformed the baseline logistic regression model, with the greatest improvement from the elastic net logistic regression model (AUC 0.754 [0.746-0.762] vs 0.674 [0.664-0.683], P<.001). Top predictors included retention history, being seen by an infectious disease provider (vs a primary care provider), site of care, Hispanic ethnicity, and laboratory testing for gonorrhea and chlamydia. The random forest model (AUC 0.751 [0.742-0.759]) revealed age, insurance type, and chronic comorbidities such as hypertension as important in predicting a lapse in care.
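For readers less familiar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from predicted scores: the AUC equals the probability that a randomly chosen lapse case receives a higher score than a randomly chosen non-lapse case, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical; this is not the authors' evaluation code and the numbers are not study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = lapse in care).
    scores: iterable of predicted risk scores, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example with made-up values.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.5, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; for a data set with hundreds of thousands of encounters, a rank-based implementation would be the more practical choice.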
Conclusions:
We used a real-world approach to leverage the full scope of data available in modern EHRs to predict lapses in HIV care. Our findings reinforce previously known factors, such as history of prior lapses in care, while also showing the importance of laboratory testing, chronic comorbidities, sociodemographic characteristics, and individual clinic-specific factors for predicting lapses in care for PLWH in Chicago. This work provides a framework for others to use data from multiple different healthcare systems within a single city to examine lapses in care using EHR data.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.