Accepted for/Published in: JMIR AI

Date Submitted: Aug 4, 2025
Date Accepted: Dec 6, 2025

The final, peer-reviewed published version of this preprint can be found here:

Treatment Recommendations for Clinical Deterioration on the Wards: Development and Validation of Machine Learning Models

Pulick E, Carey KA, Qyli T, Oguss MK, Picart JK, Penumalee L, Nezirova LK, Tully ST, Gilbert ER, Shah NS, Ravichandran U, Afshar M, Edelson DP, Mintz Y, Churpek MM

JMIR AI 2026;5:e81642

DOI: 10.2196/81642

PMID: 41544252

PMCID: 12810948

Treatment Recommendations for Clinical Deterioration on the Wards: Development and Validation of Machine Learning Models

  • Eric Pulick
  • Kyle A Carey
  • Tonela Qyli
  • Madeline K Oguss
  • Jamila K Picart
  • Leena Penumalee
  • Lily K Nezirova
  • Sean T Tully
  • Emily R Gilbert
  • Nirav S Shah
  • Urmila Ravichandran
  • Majid Afshar
  • Dana P Edelson
  • Yonatan Mintz
  • Matthew M Churpek

ABSTRACT

Background:

Clinical deterioration in general ward patients is associated with increased morbidity and mortality. Early and appropriate treatments can improve outcomes for such patients. While machine learning tools have proven successful in the early identification of clinical deterioration risk, little work has explored their effectiveness in providing data-driven treatment recommendations to clinicians for high-risk patients.

Objective:

This study established machine learning performance benchmarks for predicting the need for 10 common clinical deterioration interventions and compared several model types to identify which approaches are well suited to these prediction tasks.

Methods:

We used a chart-reviewed, multicenter dataset of general ward patients experiencing clinical deterioration (n=2480 encounters) who were identified as high risk using a Food and Drug Administration-cleared early warning score (eCART). Manual chart review assigned each encounter gold-standard labels for lifesaving treatments. We trained elastic net logistic regression, gradient boosted machine, long short-term memory, and stacking ensemble models to predict the need for 10 common deterioration interventions at the time of the deterioration early warning score. Models were trained on encounters from 3 health systems and externally validated on encounters from a fourth health system. Discriminative performance, assessed by the area under the receiver operating characteristic curve (AUC), was the primary evaluation metric.
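As an illustration of the evaluation design described above (train on 3 health systems, validate on a fourth, score by AUC), the following is a minimal sketch in plain Python. It is not the authors' code: the field names (`site`, `label`, `score`), the site letters, and all toy values are hypothetical, and `score` simply stands in for a predicted probability from a model fit on the training sites.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_by_site(encounters, held_out_site):
    """Keep one health system entirely out of training for external validation."""
    train = [e for e in encounters if e["site"] != held_out_site]
    test = [e for e in encounters if e["site"] == held_out_site]
    return train, test


# Hypothetical toy encounters: label = 1 if the treatment was needed,
# score = placeholder for a model's predicted probability.
encounters = [
    {"site": "A", "label": 0, "score": 0.20},
    {"site": "A", "label": 1, "score": 0.70},
    {"site": "B", "label": 0, "score": 0.30},
    {"site": "B", "label": 1, "score": 0.60},
    {"site": "C", "label": 0, "score": 0.40},
    {"site": "C", "label": 1, "score": 0.90},
    {"site": "D", "label": 0, "score": 0.10},
    {"site": "D", "label": 0, "score": 0.40},
    {"site": "D", "label": 1, "score": 0.35},
    {"site": "D", "label": 1, "score": 0.80},
]

train, external = split_by_site(encounters, held_out_site="D")
external_auc = auc(
    [e["label"] for e in external],
    [e["score"] for e in external],
)
print(f"train n={len(train)}, external n={len(external)}, AUC={external_auc:.2f}")
```

Splitting by site rather than at random keeps every encounter from the fourth system unseen during training, so the AUC reflects transfer to a new health system. In practice this would be repeated once per intervention, since each of the 10 treatments is a separate prediction task.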

Results:

Discriminative performance varied widely by model and prediction task, with AUCs typically ranging from 0.7 to 0.9. Across all models, antiarrhythmics were the easiest treatment to predict (mean AUC 0.866), while anticoagulants were the hardest to predict (mean AUC 0.660). While no individual modeling approach outperformed the others across all tasks, the gradient boosted machines tended to show the best individual performance. Additionally, the stacking ensemble, which combined predictions from all models, typically matched or outperformed the best-performing individual model for each task. We also demonstrated that a sizable fraction of patients in our evaluation cohort were untreated at the time of the high-risk early warning flag, highlighting an opportunity to use machine learning tools to decrease treatment latency.
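To make the idea of a stacking ensemble concrete, here is a small sketch in plain Python. The abstract does not say which meta-learner was used, so this assumes a logistic regression fit on the base models' predicted probabilities; the function names, learning rate, epoch count, and toy numbers are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_meta_learner(base_probs, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression meta-learner by batch gradient descent.

    base_probs: one row per encounter, one column per base model,
                each entry a predicted probability from that model.
    labels:     0/1 indicator of whether the treatment was needed.
    """
    n, k = len(labels), len(base_probs[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, y in zip(base_probs, labels):
            err = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - y
            grad_b += err
            for j, x in enumerate(row):
                grad_w[j] += err * x
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def stack_predict(weights, bias, row):
    """Combine one encounter's base-model probabilities into a stacked probability."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Hypothetical toy data: column 0 is a more informative base model,
# column 1 a weaker one. Values are illustrative only.
labels = [0, 0, 0, 1, 1, 1]
base_probs = [
    [0.2, 0.5], [0.3, 0.4], [0.4, 0.6],
    [0.6, 0.5], [0.7, 0.7], [0.9, 0.6],
]

weights, bias = fit_meta_learner(base_probs, labels)
stacked = [stack_predict(weights, bias, row) for row in base_probs]
print("meta-learner weights:", [round(w, 2) for w in weights])
```

In a real pipeline the meta-learner would normally be fit on out-of-fold base-model predictions rather than on the rows the base models were trained on, to avoid leaking training labels into the ensemble. The toy example skips that step for brevity.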

Conclusions:

We found variability in the discrimination of machine learning models across tasks and model approaches for predicting lifesaving treatments in patients with clinical deterioration. Overall performance was high, and these models could be paired with early warning scores to provide clinicians with timely and actionable treatment recommendations to improve patient care.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.