Currently submitted to: JMIR Medical Informatics
Date Submitted: Oct 20, 2023
Date Accepted: Apr 20, 2024
Clinician-Based Validation of a Machine Learning Application to Classify Patients at Differing Levels of Risk of Opioid Use Disorder
ABSTRACT
Background:
Despite restrictive opioid management guidelines, opioid use disorder (OUD) remains a major public health concern. Machine learning (ML) offers a promising avenue for identifying OUD and alerting clinicians to it, thus supporting better clinical decision-making regarding treatment. The performance of an ML application that alerts clinicians to a patient’s risk of OUD was evaluated by comparing it with a structured chart review by clinicians.
Objective:
To assess the clinical validity of an ML-based application designed to identify and alert clinicians of different levels of patients’ OUD risk.
Methods:
The ML application generated OUD risk alerts on outpatient data for 649,504 patients from 2 medical centers between 2010 and 2013. A random sample of 60 patients was selected from each of 3 OUD risk level categories (n=180). An OUD risk classification scheme and a standardized data extraction tool were developed to evaluate the validity of the alerts. Clinicians independently conducted a systematic and structured chart review and came to consensus on each patient’s OUD risk level, which was then compared with the ML application’s risk assignments.
Results:
A total of 78,587 non-cancer patients with at least 1 opioid prescription were classified as Not High Risk (64.1%), High Risk (21.2%), or Suspected OUD/OUD (14.7%). The sample of 180 patients was representative of the total population in age, sex, and race. The inter-rater reliability between the ML application and clinicians had a weighted kappa coefficient (95% CI) of 0.62 (0.53, 0.71), indicating good agreement. Combining the High Risk and Suspected OUD/OUD categories and using the chart review as a ‘gold standard’, the ML application had a corrected sensitivity (95% CI) of 56.6% (48.7%, 64.5%) and a corrected specificity (95% CI) of 94.2% (90.3%, 98.1%). The positive and negative predictive values (95% CI) were 93.3% (88.2%, 96.3%) and 60.0% (50.4%, 68.9%), respectively. Key themes for disagreements between the ML application and clinician reviews were identified.
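The abstract does not say how the sensitivity and specificity were "corrected". One plausible reading, offered here only as an assumption, is that the equal-sized sample (60 per category) was reweighted to the population shares of flagged and unflagged patients. The sketch below applies that reweighting to the reported predictive values; the function name `corrected_sens_spec` is hypothetical and not taken from the paper.

```python
def corrected_sens_spec(flagged_share, ppv, npv):
    """Reweight predictive values by population shares (assumed method).

    flagged_share: fraction of the population the ML application flagged
    ppv, npv: positive and negative predictive values from the sample
    Returns (sensitivity, specificity) at the population level.
    """
    unflagged_share = 1 - flagged_share
    tp = flagged_share * ppv            # flagged, confirmed by chart review
    fp = flagged_share * (1 - ppv)      # flagged, not confirmed
    tn = unflagged_share * npv          # not flagged, confirmed not at risk
    fn = unflagged_share * (1 - npv)    # not flagged, but at risk on review
    return tp / (tp + fn), tn / (tn + fp)


# Values reported in the abstract: High Risk (21.2%) plus
# Suspected OUD/OUD (14.7%) form the flagged group.
flagged_share = 0.212 + 0.147
sens, spec = corrected_sens_spec(flagged_share, ppv=0.933, npv=0.600)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

Under this assumption the result is roughly 56.6% sensitivity and 94.1% specificity, close to the reported 56.6% and 94.2%. The small gap in specificity may reflect rounding of the inputs or a per-category weighting that the abstract does not describe.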
Conclusions:
A systematic comparison was conducted between an ML system and clinicians for OUD risk identification. The ML application generated clinically valid and useful alerts about patients’ different risk levels of OUD. ML applications hold promise for identifying patients at differing levels of OUD risk and will likely complement traditional rule-based approaches to generating alerts about opioid safety issues.
Clinical Trial: Not applicable
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.