Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jun 22, 2025
Open Peer Review Period: Jun 22, 2025 - Aug 17, 2025
Date Accepted: Dec 15, 2025

The final, peer-reviewed published version of this preprint can be found here:

Faysal JA, Lo-Ciganic W, Gellad WF, Wu Y, Harle CA, Nguyen K, Huang JL, Cochran G, Wilson DL, Staras SAS, Schmidt SO, Rosenberg EI, Nelson D, Yan S, Reisfield GM, Greene WM, Kuza C, Hasan MM

Developing and Validating a Machine Learning Algorithm to Predict the Risk of Incident Opioid Use Disorder Among OneFlorida+ Patients: Prognostic Modeling Study

J Med Internet Res 2026;28:e79482

DOI: 10.2196/79482

Developing and Validating a Machine-learning Algorithm to Predict the Risk of Incident Opioid Use Disorder among OneFlorida+ Patients: A Prognostic Modeling Study

  • Jabed Al Faysal
  • Weihsuan Lo-Ciganic
  • Walid F. Gellad
  • Yonghui Wu
  • Christopher A. Harle
  • Khoa Nguyen
  • James L. Huang
  • Gerald Cochran
  • Debbie L. Wilson
  • Stephanie A S Staras
  • Siegfried O.F. Schmidt
  • Eric I. Rosenberg
  • Danielle Nelson
  • Shunhua Yan
  • Gary M. Reisfield
  • William M. Greene
  • Courtney Kuza
  • Md Mahmudul Hasan

ABSTRACT

Background:

Opioid use disorder (OUD) remains a critical public health crisis in the United States. Despite widespread policy and clinical interventions, early identification of individuals at risk for developing OUD is still challenging because of limitations in traditional screening approaches and a lack of individualized risk stratification methods. Machine learning (ML) methods offer an opportunity to develop timely, high-performing, and explainable predictive models that can enhance OUD prevention strategies in clinical settings.

Objective:

To develop and validate an ML model using electronic health record (EHR) data to predict the 3-month risk of incident OUD among adults initiating opioid therapy, and to stratify patients into clinically actionable risk groups.

Methods:

This prognostic modeling study used 2017–2022 OneFlorida+ EHR data to develop and validate ML algorithms predicting 3-month incident OUD risk. We included 182,083 adults (≥18 years) without cancer or hospice care who received ≥1 outpatient, non-injectable opioid prescription. Using 183 predictors measured in sequential 3-month intervals, we developed gradient boosting machine (GBM), elastic net (EN), least absolute shrinkage and selection operator (LASSO), and random forest (RF) models on randomly split training, testing, and validation sets. Model performance was assessed using C-statistics, predictive values, and the number needed to evaluate (NNE), with patients stratified into risk deciles for clinical applicability. Model explainability was assessed using SHapley Additive exPlanations (SHAP), and racial fairness was evaluated using Aequitas metrics.
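For readers unfamiliar with the C-statistic named above, the following is a minimal sketch of how it can be computed for a binary outcome, assuming the standard pairwise (concordance) definition. It is not the authors' code; the function name `c_statistic` and the toy outcomes and scores are hypothetical and are not study data.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance (C) statistic for a binary outcome.

    Returns the share of case/non-case pairs in which the case received
    the higher predicted risk; tied scores count as half a concordant pair.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score, non_case_score in product(cases, non_cases):
        if case_score > non_case_score:
            concordant += 1.0
        elif case_score == non_case_score:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical toy data (1 = incident OUD, 0 = no OUD); NOT study data.
toy_outcomes = [0, 0, 1, 0, 1, 1, 0, 0]
toy_scores = [0.05, 0.10, 0.40, 0.35, 0.80, 0.30, 0.20, 0.02]
toy_c = c_statistic(toy_outcomes, toy_scores)
print(f"C-statistic on toy data: {toy_c:.3f}")
```

The pairwise loop is quadratic in sample size, so it suits small illustrations only; for cohorts of the size described here, a rank-based implementation from an established statistics library would be the more practical choice.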

Results:

In the validation sample (n=60,694), GBM (C-statistic=0.879, 95% CI=0.874-0.884) and EN (C-statistic=0.872, 95% CI=0.867-0.877) outperformed LASSO (C-statistic=0.846, 95% CI=0.840-0.851) and RF (C-statistic=0.798, 95% CI=0.792-0.804), with the GBM model requiring the fewest predictors (n=75) for predicting 3-month incident OUD. Using the GBM algorithm to predict subsequent 3-month OUD risk, the top decile subgroup had a positive predictive value (PPV) of 15.06%, a negative predictive value (NPV) of 99.8%, and an NNE of 89. The top decile (n=6,696) captured approximately 68% of patients with OUD. SHAP analysis identified age, number of outpatient visits, history of back and other pain conditions, comorbidity burden, and opioid prescribing patterns as the strongest predictors of incident OUD. Fairness assessment showed acceptable false negative rate parity across racial groups.
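The top-decile figures above come from flagging the highest-risk tenth of patients and tabulating outcomes against that flag. The sketch below illustrates that kind of calculation under simple assumptions: the top decile is taken as the highest-scoring floor(n/10) patients, and the predictive values use their textbook definitions. The function name `top_decile_metrics` and the toy data are hypothetical; this is not the study's implementation and does not reproduce its numbers.

```python
def top_decile_metrics(y_true, y_score):
    """Flag the highest-scoring 10% of patients and summarize the split.

    Assumption: the top decile is the floor(n / 10) patients with the
    highest predicted risk; ties are broken by input order.
    """
    n = len(y_true)
    if n < 10:
        raise ValueError("need at least 10 patients to form a decile")
    n_flagged = n // 10

    # Patient indices sorted from highest to lowest predicted risk.
    order = sorted(range(n), key=lambda i: y_score[i], reverse=True)
    flagged = set(order[:n_flagged])

    tp = sum(1 for i in flagged if y_true[i] == 1)  # flagged, has outcome
    fp = n_flagged - tp                             # flagged, no outcome
    fn = sum(1 for i in range(n) if i not in flagged and y_true[i] == 1)
    tn = (n - n_flagged) - fn                       # not flagged, no outcome

    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Share of all patients with the outcome who fall in the top decile.
        "captured": tp / (tp + fn) if (tp + fn) else float("nan"),
    }


# Hypothetical toy cohort of 20 patients; NOT study data.
toy_scores = [i / 20 for i in range(20)]
toy_outcomes = [0] * 20
toy_outcomes[19] = 1  # highest-risk patient develops the outcome
toy_outcomes[10] = 1  # a mid-risk patient also develops the outcome
toy_metrics = top_decile_metrics(toy_outcomes, toy_scores)
print(toy_metrics)
```

In this toy cohort two patients are flagged, one of whom has the outcome, so the flag catches one of the two cases overall. How the study defined decile boundaries and NNE is not stated in the abstract, so NNE is deliberately left out of the sketch.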

Conclusions:

Building on previous claims-based ML models to predict OUD in the Medicare population, our study demonstrated high discriminative performance and effective risk stratification for incident OUD using OneFlorida+ EHR data, with potential to inform early intervention. ML algorithms leveraging real-world EHR data can support clinical decision-making to proactively identify patients at risk for OUD.

Clinical Trial:

Not applicable


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.