Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jul 26, 2019
Date Accepted: Oct 8, 2019

The final, peer-reviewed published version of this preprint can be found here:

The Detection of Opioid Misuse and Heroin Use From Paramedic Response Documentation: Machine Learning for Improved Surveillance

Prieto JT, Scott K, McEwen D, Podewils LJ, Al-Tayyib A, Robinson J, Edwards D, Foldy S, Shlay JC, Davidson AJ

J Med Internet Res 2020;22(1):e15645

DOI: 10.2196/15645

PMID: 31899451

PMCID: 6969388

Detecting evidence of potential opioid misuse and heroin use from paramedic response documentation: a machine learning application for improved surveillance

  • José Tomás Prieto
  • Kenneth Scott
  • Dean McEwen
  • Laura J Podewils
  • Alia Al-Tayyib
  • James Robinson
  • David Edwards
  • Seth Foldy
  • Judith C Shlay
  • Arthur J Davidson

ABSTRACT

Background:

Timely, precise, and localized surveillance of nonfatal events is needed to improve response to and prevention of opioid-related problems in the evolving opioid crisis in the United States. Records of naloxone administration found in prehospital emergency medical services (EMS) data have helped estimate opioid overdose incidence, including nonhospital, field-treated cases. However, because EMS personnel often administer naloxone for unconsciousness of unknown cause, attributing naloxone administration to opioid misuse and heroin use (OM) may misclassify events. Better methods are needed to identify OM.

Objective:

We sought to fill this gap by developing and testing a natural language processing method that would improve identification of potential OM from paramedic documentation.

Methods:

First, we searched Denver Health paramedic trip reports from August 2017 to April 2018 for the keywords naloxone, heroin, and both combined, and reviewed the narratives of identified reports to determine whether they constituted true cases of OM. Then, we used this human classification as the reference standard and trained four machine learning models (random forest, k-nearest neighbors, support vector machines, and L1-regularized logistic regression). We selected the algorithm that produced the highest area under the receiver operating characteristic curve (AUC) for model assessment. Finally, we compared the positive predictive value (PPV) of the highest performing machine learning algorithm with the PPV of searches for the keywords naloxone, heroin, and the combination of both, in binary classification of OM in unseen September 2018 data.
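The evaluation steps described above (keyword flagging, PPV against a reviewer-assigned reference standard, and AUC for ranking classifier scores) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' pipeline: the narratives, labels, and scores are invented toy data, and the helper names `keyword_flag`, `ppv`, and `auc` are hypothetical.

```python
import re

def keyword_flag(narrative, keywords):
    """Return True if any keyword occurs as a whole word (case-insensitive)."""
    text = narrative.lower()
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)

def ppv(predicted, actual):
    """Positive predictive value: true positives / all flagged reports."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    flagged = sum(1 for p in predicted if p)
    return true_pos / flagged if flagged else float("nan")

def auc(scores, labels):
    """AUC as the chance a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy narratives with invented reviewer labels (True = OM).
reports = [
    ("Pt found unresponsive, naloxone given, later admits heroin use.", True),
    ("Naloxone administered for unconsciousness of unknown cause.", False),
    ("Pt states he injected heroin one hour ago.", True),
    ("Fall from ladder, ankle pain.", False),
]
labels = [label for _, label in reports]

naloxone_hits = [keyword_flag(text, ["naloxone"]) for text, _ in reports]
heroin_hits = [keyword_flag(text, ["heroin"]) for text, _ in reports]
combined_hits = [a or b for a, b in zip(naloxone_hits, heroin_hits)]

# Invented classifier scores, one per report, in the same order.
scores = [0.9, 0.6, 0.5, 0.1]

print("PPV naloxone:", ppv(naloxone_hits, labels))
print("PPV heroin:  ", ppv(heroin_hits, labels))
print("PPV combined:", ppv(combined_hits, labels))
print("AUC of toy scores:", auc(scores, labels))
```

In this toy data the naloxone keyword flags one report that reviewers did not consider OM, which lowers its PPV relative to the heroin keyword; the same kind of misclassification motivates the study.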

Results:

In total, 54,359 trip reports were filed from August 2017 to April 2018. Approximately 1% (594/54,359) indicated naloxone administration. Among trip reports with reviewer agreement regarding OM in the narrative, 57% (292/516) were considered to include information revealing OM. Approximately 2% of all trip reports (884/54,359) mentioned heroin in the narrative. Among trip reports with reviewer agreement, 96% (784/821) were considered to include information revealing OM. Combined results accounted for 2% of trip reports (1,298/54,359). Among trip reports with reviewer agreement, 78% (907/1,166) were considered to include information consistent with OM. The reference standard used to train and test machine learning models included details of 1,166 trip reports. L1-regularized logistic regression was the highest performing algorithm (AUC = 0.94, 95% CI: 0.91-0.97) in identifying OM. Tested on 5,983 unseen reports from September 2018, the keyword naloxone inaccurately identified and underestimated probable OM trip report cases (63 cases; PPV = 0.64). The keyword heroin yielded more cases with improved performance (129 cases; PPV = 0.99). Combined keyword and L1-regularized logistic regression classifier further improved performance (146 cases; PPV = 0.99).
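The proportions reported above can be recomputed directly from the stated numerators and denominators. The short sketch below does so at one decimal place; the counts are taken from the Results text, while the dictionary keys and the helper `pct` are illustrative names only.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# (numerator, denominator) pairs as reported in the Results.
reported = {
    "reports indicating naloxone administration": (594, 54359),
    "naloxone reports judged to reveal OM": (292, 516),
    "reports mentioning heroin": (884, 54359),
    "heroin reports judged to reveal OM": (784, 821),
    "combined keyword reports": (1298, 54359),
    "combined reports consistent with OM": (907, 1166),
}

for name, (num, den) in reported.items():
    print(f"{name}: {num}/{den} = {pct(num, den)}%")
```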

Conclusions:

A machine learning application enhanced the effectiveness of finding OM among documented paramedic responses. This approach to refining OM surveillance may lead to improved response to and prevention of overdoses and other opioid-related problems in US communities.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.