JMIR Preprints #63147: Natural Language Processing For Identification of Hospitalized People Who Use Drugs


Natural Language Processing For Identification of Hospitalized People Who Use Drugs

Taisuke Sato;
Emily D Grussing;
Ruchi Patel;
Jessica Ridgway;
Joji Suzuki;
Benjamin Sweigart;
Robert Miller;
Alysse G Wurcel

ABSTRACT

Background:

People who use drugs (PWUD) are at heightened risk for severe injection-related infections. Current clinical practice and research identify PWUD mostly through biomarkers, medication records, ICD codes, and patient self-screening forms; even in combination, these tools often fail to identify hospitalized patients with substance use disorder (SUD), missing crucial opportunities to intervene in serious injection-related infections (SIRI).

Objective:

This study explores using Natural Language Processing (NLP) to enhance the equitable and comprehensive identification of PWUD in electronic medical records (EMR).

Methods:

We retrospectively compiled a cohort of hospitalizations that involved PWUD at Tufts Medical Center (2020-2022). Criteria for entering the cohort included ICD-10 codes for SUD, positive drug toxicology, SUD treatment prescriptions, and specific NLP keywords. We conducted human review of clinical notes in the EMR to calculate the positive and negative predictive values of two subcohorts: admissions associated only with a diagnosis code of substance use disorder (D-only) and admissions associated only with NLP identification of drug use (N-only). We also conducted a regression analysis to evaluate the impact of race, ethnicity, and Social Vulnerability Index (SVI) on the outcome of highly documented drug use versus drug use documented only through NLP.
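The subcohort definitions above can be illustrated with a minimal Python sketch. It is not the authors' code: the `Admission` class, its field names, and the `subcohort` function are hypothetical, and the sketch assumes that "D-only" and "N-only" mean the admission met exactly one of the four entry criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Admission:
    # Hypothetical flags, one per cohort entry criterion named in the Methods.
    has_sud_icd10: bool      # ICD-10 code for SUD
    has_positive_tox: bool   # positive drug toxicology
    has_sud_rx: bool         # SUD treatment prescription
    has_nlp_keyword: bool    # NLP keyword found in clinical notes


def subcohort(adm: Admission) -> str:
    """Label an admission by how it entered the cohort (assumed logic)."""
    flags = (adm.has_sud_icd10, adm.has_positive_tox,
             adm.has_sud_rx, adm.has_nlp_keyword)
    if not any(flags):
        return "not in cohort"
    if sum(flags) == 1:
        # Exactly one criterion met: check whether it is diagnosis or NLP.
        if adm.has_sud_icd10:
            return "D-only"
        if adm.has_nlp_keyword:
            return "N-only"
    # Several criteria met, or a single toxicology/prescription criterion.
    return "other"
```

Under these assumptions, an admission whose only signal is an NLP keyword hit is labeled `"N-only"`, while one with both a diagnosis code and a keyword hit falls into `"other"`.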

Results:

The study identified 4548 hospitalizations, with broad heterogeneity in how hospitalizations entered the cohort and subcohorts. Of these, 288 hospitalizations entered the cohort through NLP presence alone. NLP demonstrated a 54% positive predictive value (PPV), outperforming biomarkers, medication records, and ICD codes in identifying hospitalizations of PWUD. Additionally, NLP significantly enhanced these methods when integrated into the identification algorithm. The study also found that people from racially and ethnically minoritized communities and patients of lower socioeconomic status were significantly more likely to have SUD not documented in EMRs.
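For readers less familiar with the metric, PPV is the share of flagged admissions that human review confirms as involving drug use, and NPV is the share of unflagged admissions that review confirms as not involving drug use. The Python sketch below shows the arithmetic only; the function name and the toy labels are illustrative assumptions and are not data from the study.

```python
from typing import Optional, Sequence, Tuple


def predictive_values(
    flagged: Sequence[bool], confirmed: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (PPV, NPV) from method flags and human-review labels."""
    tp = fp = tn = fn = 0
    for f, c in zip(flagged, confirmed):
        if f and c:
            tp += 1      # flagged and confirmed by review
        elif f and not c:
            fp += 1      # flagged but not confirmed
        elif not f and not c:
            tn += 1      # not flagged, review agrees
        else:
            fn += 1      # not flagged, but review found drug use
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Toy labels for illustration only (not study data).
toy_flagged = [True, True, True, True, False, False]
toy_confirmed = [True, True, False, False, False, True]
toy_ppv, toy_npv = predictive_values(toy_flagged, toy_confirmed)
```

In the toy input, two of the four flagged admissions are confirmed, so `toy_ppv` is 0.5; one of the two unflagged admissions is confirmed negative, so `toy_npv` is also 0.5.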

Conclusions:

NLP proved effective in identifying hospitalizations of PWUD, surpassing traditional methods. While further refinement is needed, NLP shows promise for minimizing healthcare disparities, particularly in infectious disease care for patients with SUD, and represents a crucial step towards more equitable healthcare.

Citation

Please cite as:

Sato T, Grussing ED, Patel R, Ridgway J, Suzuki J, Sweigart B, Miller R, Wurcel AG

Natural Language Processing for Identification of Hospitalized People Who Use Drugs: Cohort Study

JMIR AI 2025;4:e63147

DOI: 10.2196/63147

PMID: 40680182

PMCID: 12294639


Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Jun 11, 2024

Date Accepted: Mar 17, 2025
