Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Dec 15, 2023
Date Accepted: Sep 9, 2024
Complete Blood Count and MDW-based Machine Learning Algorithms for Sepsis Detection: a Multicentric Development and External Validation Study
ABSTRACT
Background:
Sepsis is an organ dysfunction caused by a dysregulated host response to infection. Early detection is fundamental to improving patient outcomes. Laboratory medicine can play a crucial role by providing biomarkers whose alteration can be detected before the onset of clinical signs and symptoms. In particular, the relevance of Monocyte Distribution Width (MDW) as a sepsis biomarker has emerged over the past decade. Despite encouraging results, however, MDW has poor sensitivity and positive predictive value when compared to other biomarkers.
Objective:
Machine Learning (ML) techniques promise to overcome the above-mentioned limitations by combining different parameters and thereby improving sepsis detection performance. Making ML models work in clinical practice, however, may be problematic, as their performance may suffer when they are deployed in contexts other than the research environment: in fact, even widely used commercially available models have been shown to generalize poorly in out-of-distribution scenarios. The aim of this multicentric study was to develop and externally validate ML models whose intended use is the early detection and screening of sepsis on the basis of MDW and other Complete Blood Count (CBC) parameters.
Methods:
Five patient cohorts (encompassing 5344 patients) collected at five different Italian hospitals were used to train and externally validate six ML models. To improve generalizability and robustness to different types of data distribution shifts, the developed ML models combine traditional ML methodologies with advanced techniques inspired by controllable AI, namely: cautious classification, which gives the ML models the ability to abstain from making predictions; and explainable AI, which provides clinicians and health operators with useful information about the models' functioning.
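As an illustration of the cautious classification idea mentioned above, the following is a minimal sketch of a classifier with a reject option: it abstains whenever the predicted probability of sepsis falls inside an uncertainty band. The abstract does not specify how abstention is implemented in the study, so the function names (`cautious_predict`, `coverage`), the thresholds (0.3 and 0.7), and the example probabilities are all assumptions chosen for illustration, not the authors' method.

```python
from typing import List, Optional


def cautious_predict(p_sepsis: float, low: float = 0.3, high: float = 0.7) -> Optional[int]:
    """Return 1 (sepsis), 0 (no sepsis), or None (abstain).

    `low` and `high` are hypothetical thresholds: probabilities strictly
    between them are treated as too uncertain to act on.
    """
    if not 0.0 <= p_sepsis <= 1.0:
        raise ValueError("p_sepsis must be a probability in [0, 1]")
    if p_sepsis >= high:
        return 1
    if p_sepsis <= low:
        return 0
    return None  # abstain: defer the case to a clinician


def coverage(predictions: List[Optional[int]]) -> float:
    """Fraction of cases on which the model did not abstain."""
    if not predictions:
        return 0.0
    answered = sum(1 for p in predictions if p is not None)
    return answered / len(predictions)


# Made-up probabilities, standing in for the output of any probabilistic model.
example_probs = [0.05, 0.45, 0.92, 0.65]
example_preds = [cautious_predict(p) for p in example_probs]
```

Widening the band between `low` and `high` lowers coverage (more cases are deferred) in exchange for fewer confident errors, which is the trade-off a reject option is meant to expose.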
Results:
The developed models achieved good diagnostic performance on internal validation (AUC between 0.91 and 0.98) as well as consistent generalization performance across the external validation datasets (AUC between 0.75 and 0.95), outperforming baseline biomarkers and state-of-the-art ML models for sepsis detection. Controllable AI techniques further improved performance and were used to derive a simple, interpretable set of diagnostic rules.
Conclusions:
Our findings demonstrate how controllable AI approaches based on CBC and MDW may be used for the early detection of sepsis, and how the proposed methodology can be used to develop ML models that are more resistant to different types of data distribution shifts.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.