Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 19, 2021
Date Accepted: Apr 12, 2022
Date Submitted to PubMed: May 3, 2022

The final, peer-reviewed published version of this preprint can be found here:

Machine Learning–Based Prediction Models for Different Clinical Risks in Different Hospitals: Evaluation of Live Performance

Sun H, Depraetere K, Meesseman L, Cabanillas Silva P, Szymanowsky R, Fliegenschmidt J, Hulde N, von Dossow V, Vanbiervliet M, De Baerdemaeker J, Roccaro-Waldmeyer DM, Stieg J, Domínguez Hidalgo M, Dahlweid M

J Med Internet Res 2022;24(6):e34295

DOI: 10.2196/34295

PMID: 35502887

PMCID: 9214618

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

An evaluation of ML-based clinical risk prediction models in live EHR systems: a study on multiple use cases in different hospitals

  • Hong Sun
  • Kristof Depraetere
  • Laurent Meesseman
  • Patricia Cabanillas Silva
  • Ralph Szymanowsky
  • Janis Fliegenschmidt
  • Nikolai Hulde
  • Vera von Dossow
  • Martijn Vanbiervliet
  • Jos De Baerdemaeker
  • Diana Manuela Roccaro-Waldmeyer
  • Jörg Stieg
  • Manuel Domínguez Hidalgo
  • Michael Dahlweid

ABSTRACT

Background:

Machine learning (ML) algorithms are currently used in a wide array of clinical domains to produce models that can predict clinical risk events. Most models are developed and evaluated with retrospective data; very few are evaluated in a clinical workflow, and even fewer report performance in different hospitals. We provide detailed evaluations of clinical risk prediction models in live clinical workflows for three different use cases in three different hospitals.

Objective:

The main objective of this study was to evaluate clinical risk prediction models in live clinical workflows and to compare that live performance with their performance on retrospective data. We also aimed to generalize the results by applying our investigation to three different use cases in three different hospitals.

Methods:

We trained clinical risk prediction models for three use cases (delirium, sepsis, and acute kidney injury [AKI]) in three different hospitals with retrospective data. The models were deployed in these three hospitals and used in daily clinical practice. The predictions made by these models were logged and correlated with the diagnosis at discharge. We compared the live performance with evaluations on retrospective data and conducted cross-hospital evaluations.
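As an illustration of how logged predictions can be scored against discharge diagnoses, the following is a minimal sketch, not the authors' pipeline. It computes the area under the receiver operating characteristic curve (AUROC) as the probability that a randomly chosen diagnosed admission received a higher risk score than a randomly chosen undiagnosed one. The record fields (`admission_id`, `risk_score`, `diagnosed`) and the sample values are hypothetical; the abstract does not describe the log format.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison (Mann-Whitney U); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical log: one risk score per admission, joined with the
# diagnosis recorded at discharge (1 = diagnosed, 0 = not diagnosed).
logged = [
    {"admission_id": "a1", "risk_score": 0.91, "diagnosed": 1},
    {"admission_id": "a2", "risk_score": 0.15, "diagnosed": 0},
    {"admission_id": "a3", "risk_score": 0.35, "diagnosed": 1},
    {"admission_id": "a4", "risk_score": 0.40, "diagnosed": 0},
    {"admission_id": "a5", "risk_score": 0.08, "diagnosed": 0},
]

live_auroc = auroc(
    [r["risk_score"] for r in logged],
    [r["diagnosed"] for r in logged],
)
print(f"live AUROC: {live_auroc:.3f}")
```

The pairwise form is quadratic in the number of admissions and is chosen here only for readability; a rank-based implementation would be preferable for large logs.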

Results:

The performance of the prediction models in live clinical workflows was similar to their performance on retrospective data. The average area under the receiver operating characteristic curve (AUROC) decreased slightly, by 0.8 percentage points (from 89.4% to 88.6%). The cross-hospital evaluations showed severely reduced performance: the average AUROC decreased by about 8 percentage points (from 94.2% to 86.3%), which indicates the importance of model calibration with data from the deployment hospitals.

Conclusions:

Calibrating the prediction model with data from each deployment hospital leads to good performance in live settings. The performance degradation in the cross-hospital evaluation indicates the limitations of developing a single generic model for different hospitals. Designing a generic model development process that generates a specialized prediction model for each hospital guarantees model performance in different hospitals.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.