Natural Language Processing For Identification of Hospitalized People Who Use Drugs
ABSTRACT
Background:
People who use drugs (PWUD) are at heightened risk for severe injection-related infections. Current clinical practice and research rely mostly on biomarkers, medication records, ICD codes, and patient self-screening forms to identify PWUD. Even in combination, these tools often fail to identify hospitalized patients with substance use disorder (SUD), missing crucial opportunities to intervene in serious injection-related infections (SIRI).
Objective:
This study explores using Natural Language Processing (NLP) to enhance the equitable and comprehensive identification of PWUD in electronic medical records (EMR).
Methods:
We retrospectively compiled a cohort of hospitalizations that involved PWUD at Tufts Medical Center (2020-2022). Criteria for entering the cohort included ICD-10 codes for SUD, positive drug toxicology, SUD treatment prescriptions, and specific NLP keywords. We conducted human review of clinical notes in electronic health records (EHR) to calculate the positive and negative predictive values of two subcohorts: admissions associated only with a diagnosis code of substance use disorder (D-only) and admissions associated only with NLP identification of drug use (N-only). We also conducted a regression analysis to evaluate the association of race, ethnicity, and Social Vulnerability Index (SVI) with whether drug use was highly documented or documented only through NLP.
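As an illustration of the cohort logic described above, the following is a minimal sketch, not the study's actual implementation. The field names, the `Admission` class, and the helper functions are hypothetical, and the sketch assumes that "D-only" and "N-only" mean that exactly one of the four entry criteria was met. The counts in the usage lines are made up for demonstration and are not study data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Admission:
    """Hypothetical per-admission flags, one per cohort entry criterion."""
    icd10_sud: bool      # ICD-10 code for SUD present
    positive_tox: bool   # positive drug toxicology
    sud_rx: bool         # SUD treatment prescription
    nlp_keyword: bool    # NLP keyword match in clinical notes


def in_cohort(a: Admission) -> bool:
    """An admission enters the cohort if any criterion is met."""
    return a.icd10_sud or a.positive_tox or a.sud_rx or a.nlp_keyword


def subcohort(a: Admission) -> Optional[str]:
    """Return 'D-only' or 'N-only' when that is the sole criterion met."""
    flags = (a.icd10_sud, a.positive_tox, a.sud_rx, a.nlp_keyword)
    if sum(flags) != 1:
        return None
    if a.icd10_sud:
        return "D-only"
    if a.nlp_keyword:
        return "N-only"
    return None


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: confirmed positives / all flagged positives."""
    return tp / (tp + fp)


def npv(tn: int, fn: int) -> float:
    """Negative predictive value: confirmed negatives / all flagged negatives."""
    return tn / (tn + fn)


# Illustrative usage with made-up values (not study data).
example = Admission(icd10_sud=False, positive_tox=False, sud_rx=False, nlp_keyword=True)
print(in_cohort(example), subcohort(example))
print(ppv(tp=3, fp=1), npv(tn=9, fn=1))
```

In this sketch, the true and false positive counts would come from human chart review of the admissions in each subcohort.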
Results:
The study identified 4548 hospitalizations with broad heterogeneity in how people entered the cohort and subcohorts. Of these, 288 hospitalizations entered the cohort through NLP alone. NLP demonstrated a 54% positive predictive value (PPV), outperforming biomarkers, medication records, and ICD codes in identifying hospitalizations of PWUD. Additionally, NLP significantly enhanced these methods when integrated into the identification algorithm. The study also found that people from racially and ethnically minoritized communities and patients of lower socioeconomic status were significantly more likely to have SUD not documented in EMRs.
Conclusions:
NLP proved effective in identifying hospitalizations of PWUD, surpassing traditional methods. While further refinement is needed, NLP shows a promising capability in minimizing healthcare disparities, particularly in infectious disease care for SUD patients, highlighting a crucial step towards more equitable healthcare.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.