JMIR Preprints #68143: Identifying people living with or those at risk for HIV in a nationally-sampled electronic health record repository called the National Clinical Cohort Collaborative (N3C): A cohort study

Eric Hurwitz;
Cara D. Varley;
A. Jerrod Anzolone;
Vithal Madhira;
Amy L. Olex;
Jing Sun;
Dimple Vaidya;
Nada Fadul;
Jessica Y. Islam;
Lesley E. Jackson;
Kenneth J. Wilkins;
Zachary Butzin-Dozier;
Dongmei Li;
Sandra E. Safo;
Julie A. McMurry;
Pooja Maheria;
Tommy Williams;
Shukri A. Hassan;
Melissa A. Haendel;
Rena C. Patel;
The National Clinical Cohort Collaborative (N3C) Consortium

ABSTRACT

Background:

Electronic health records (EHRs) provide valuable data for clinical and epidemiological research on HIV, including the disproportionate impact of the COVID-19 pandemic on people living with HIV (PLWH). To identify PLWH, most studies using EHR or claims databases start with diagnostic codes, which can result in misclassification without further refinement using drug or laboratory data. Furthermore, because antiretrovirals now have indications for both HIV and COVID-19 (i.e., ritonavir in nirmatrelvir/ritonavir), new phenotyping methods are needed to better capture PLWH. Therefore, we created a generalizable and innovative method to robustly identify PLWH, pre-exposure prophylaxis (PrEP) users, post-exposure prophylaxis (PEP) users, and people not living with HIV (PNLWH) using granular clinical data after the emergence of COVID-19.

Objective:

The primary aim of this work was to use computational phenotyping in EHR data to identify PLWH (cohort 1), people prescribed PrEP (cohort 2), people prescribed PEP (cohort 3), and people in none of these groups (PNLWH, cohort 4), and to describe COVID-19-related characteristics among these cohorts.

Methods:

We used diagnostic, laboratory measurement, and drug concepts within the National Clinical Cohort Collaborative (N3C) to create a computational phenotype for four cohorts, each assigned a confidence level. To check robustness, clinicians performed a blinded annotation of a random sample to assess precision. We calculated the distribution of demographics, comorbidities, and COVID-19 variables among the four cohorts.
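As an illustration of how evidence from the three concept domains might be combined, the following is a minimal sketch, not the authors' algorithm. The abstract does not state the actual phenotyping or confidence rules, so the `PatientEvidence` record, its field names, and the "two or more domains means high confidence" rule are all hypothetical assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientEvidence:
    """Hypothetical per-patient evidence flags (not the N3C data model)."""
    has_condition: bool  # any HIV diagnostic concept on record
    has_lab: bool        # any HIV-related laboratory measurement
    has_drug: bool       # any antiretroviral exposure; assumed to already
                         # exclude ritonavir given only as nirmatrelvir/ritonavir


def evidence_domains(p: PatientEvidence) -> tuple[str, ...]:
    """Return the names of the domains that support an HIV classification."""
    flags = (
        ("conditions", p.has_condition),
        ("labs", p.has_lab),
        ("drugs", p.has_drug),
    )
    return tuple(name for name, present in flags if present)


def confidence(p: PatientEvidence) -> str:
    """ASSUMPTION: illustrative rule only; the study's real rules are not given here."""
    n = len(evidence_domains(p))
    if n >= 2:
        return "high"
    return "low" if n == 1 else "none"


def tally_by_domains(patients: list[PatientEvidence]) -> Counter:
    """Count patients per combination of supporting domains."""
    return Counter(evidence_domains(p) for p in patients)


if __name__ == "__main__":
    sample = [
        PatientEvidence(True, True, True),
        PatientEvidence(False, True, True),
        PatientEvidence(True, False, False),
    ]
    for combo, count in tally_by_domains(sample).items():
        print(" + ".join(combo), count)
```

Grouping patients by the combination of supporting domains mirrors how the Results report the cohort (for example, "labs and drugs"), which makes it easy to see how much of a cohort rests on a single source of evidence.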

Results:

We identified 132,664 PLWH with a high level of confidence, 36,088 PrEP users, 4,120 PEP users, and 20,639,675 PNLWH. Most PLWH were identified by a combination of conditions, laboratory measurements, and drug exposures (74,809, 56.4%), followed by labs and drugs (15,241, 11.5%), then conditions and drugs (14,595, 11.0%). Compared with the other cohorts, a higher proportion of PLWH experienced COVID-19-related hospitalization (4,650, 3.51%), COVID-19-related mortality (828, 0.62%), and all-cause mortality (2,083, 1.57%).
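The percentages above all use the 132,664 high-confidence PLWH as the denominator. The short sketch below reproduces that arithmetic from the reported counts; the variable and function names are chosen for this example only, and the "other combinations" remainder is derived here by subtraction rather than reported in the abstract.

```python
PLWH_TOTAL = 132_664  # high-confidence PLWH reported in the Results


def pct(count: int, total: int, digits: int) -> float:
    """Percentage of `total` represented by `count`, rounded to `digits` places."""
    return round(100 * count / total, digits)


# Counts of PLWH by the evidence combination that identified them.
identified_by = {
    "conditions + labs + drugs": 74_809,
    "labs + drugs": 15_241,
    "conditions + drugs": 14_595,
}

# Outcome counts among PLWH.
outcomes = {
    "COVID-19-related hospitalization": 4_650,
    "COVID-19-related mortality": 828,
    "all-cause mortality": 2_083,
}

# Derived, not reported: PLWH identified by any other combination.
other_combinations = PLWH_TOTAL - sum(identified_by.values())

if __name__ == "__main__":
    for label, n in identified_by.items():
        print(f"{label}: {n:,} ({pct(n, PLWH_TOTAL, 1)}%)")
    print(f"other combinations: {other_combinations:,} "
          f"({pct(other_combinations, PLWH_TOTAL, 1)}%)")
    for label, n in outcomes.items():
        print(f"{label}: {n:,} ({pct(n, PLWH_TOTAL, 2)}%)")
```

The three named combinations account for 104,645 PLWH, so about one in five were identified through some other combination of evidence.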

Conclusions:

Using an extensive phenotyping algorithm leveraging granular data in an EHR repository, we have identified PLWH, PNLWH, PrEP and PEP users, and offer transferable lessons to optimize future EHR phenotyping for these cohorts.

Citation

Please cite as:

Hurwitz E, Varley CD, Anzolone AJ, Madhira V, Olex AL, Sun J, Vaidya D, Fadul N, Islam JY, Jackson LE, Wilkins KJ, Butzin-Dozier Z, Li D, Safo SE, McMurry JA, Maheria P, Williams T, Hassan SA, Haendel MA, Patel RC, The National Clinical Cohort Collaborative (N3C) Consortium

Identifying People Living With or Those at Risk for HIV in a Nationally Sampled Electronic Health Record Repository Called the National Clinical Cohort Collaborative: Computational Phenotyping Study

JMIR Med Inform 2025;13:e68143

DOI: 10.2196/68143

PMID: 40644699

PMCID: 12299939

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Oct 29, 2024

Open Peer Review Period: Oct 29, 2024 - Dec 24, 2024

Date Accepted: May 15, 2025
