Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Aug 3, 2023
Date Accepted: Oct 16, 2024
Evaluating the longitudinal model shift of machine learning-based clinical risk prediction models: a study on multiple use cases across different hospitals
ABSTRACT
Background:
In recent years, machine learning-based models that predict clinical risk events have become widely used in clinical domains. However, the performance of such models relies heavily on the data used for training and evaluation. Data shift, characterized by differences between the real-world data distribution and the distribution of the training and testing data, has significant implications for prediction models, leading to performance degradation and reduced clinical efficacy. Monitoring data shifts and evaluating their impact on prediction models is therefore of utmost importance.
Objective:
This study aims to assess the impact of data shifts on machine learning-based prediction models. To generalize our findings, we evaluate three different use cases from two hospitals with different patient populations. Additionally, we investigate potential model deterioration during the COVID-19 pandemic.
Methods:
We train prediction models using retrospective data from earlier years and examine the presence of data shifts, and their impact on the models, using data from more recent years. We use the area under the receiver operating characteristic curve (AUROC) to evaluate model performance and analyze the calibration curves over time. We also assess the influence on clinical decisions by evaluating the alert rate and the rates of overdiagnosis and underdiagnosis.
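The evaluation metrics described above can be sketched as follows. This is an illustrative example on synthetic data, not the study's actual pipeline; the threshold, score distribution, and rate definitions are assumptions chosen for demonstration, using standard scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(42)

# Synthetic ground-truth labels and model risk scores for one evaluation period.
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(y_true * 0.3 + rng.normal(0.4, 0.2, size=1000), 0.0, 1.0)

# AUROC: discrimination of the model over the evaluation period.
auroc = roc_auc_score(y_true, y_score)

# Calibration curve: observed event fraction vs. mean predicted risk per bin.
frac_positive, mean_predicted = calibration_curve(y_true, y_score, n_bins=10)

# Alert rate: fraction of cases whose risk score exceeds the alert threshold
# (0.5 is a hypothetical threshold for illustration).
threshold = 0.5
alerts = y_score >= threshold
alert_rate = alerts.mean()

# Overdiagnosis rate: alerts fired for cases without the event;
# underdiagnosis rate: events missed among the non-alerted cases.
overdiagnosis_rate = (alerts & (y_true == 0)).sum() / max(alerts.sum(), 1)
underdiagnosis_rate = (~alerts & (y_true == 1)).sum() / max((~alerts).sum(), 1)

print(f"AUROC={auroc:.3f} alert_rate={alert_rate:.3f}")
```

Tracking these quantities on successive yearly cohorts, rather than on a single held-out set, is what surfaces drift that a one-time AUROC check would miss.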
Results:
Significant data shifts were observed when models were trained on data from earlier years. However, when models were trained on more recent data, we did not observe substantial data shifts, and the AUROC of the prediction models remained stable. Nevertheless, drift was observed for the delirium and sepsis use cases when the calibration curves were evaluated at the two hospitals. Additionally, the two hospitals showed different patterns in the changes of the alert rate and the overdiagnosis rate. Importantly, we did not observe any model deterioration during the COVID-19 pandemic, and the prediction models did not cause a notable surge in alerts.
Conclusions:
Clinical data undergo continuous change due to evolving clinical practices and workflows, which directly affects the predictions generated by clinical risk prediction models. Although model performance appears stable when assessed using AUROC, model drift becomes evident when alternative evaluation metrics are employed after training. Consequently, it is crucial to closely monitor data changes and detect data shifts, along with their potential influence on the predictions generated by these models.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.