Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jan 17, 2025
Date Accepted: Mar 31, 2025
Identification and Validation of an Explainable Prediction Model of Sepsis in Critically Ill Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study
ABSTRACT
Background:
Sepsis is a life-threatening condition frequently observed in critically ill patients with intracerebral hemorrhage (ICH). Early and accurate identification and prediction of sepsis are crucial. The SHapley Additive exPlanations (SHAP) technique has been employed to visualize the contribution of individual variables to predictions and has been applied in studies on heart failure, atrial fibrillation, and other conditions. However, its use in prediction models for sepsis in patients with ICH remains limited.
Objective:
This study aimed to establish and validate an explainable prediction model for sepsis in critically ill patients with ICH using a machine learning (ML) approach.
Methods:
Patients with ICH admitted to the ICU from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database between 2008 and 2022 were divided into training and internal validation sets. External validation was performed using the eICU Collaborative Research Database, which includes over 200,000 ICU admissions across the United States between 2014 and 2015. Sepsis following ICU admission was identified in the MIMIC-IV database using the Sepsis-3 criteria and in the eICU database based on clinical diagnoses. The Boruta algorithm was employed for feature selection, confirming 29 features. Nine ML algorithms were used to construct prediction models, and predictive performance was compared across several evaluation metrics, including the area under the receiver-operating characteristic curve (AUC). The SHAP technique was applied to interpret the final model.
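The modeling pipeline described above (fit a gradient-boosting classifier on the selected features, evaluate discrimination by AUC on a held-out set, then rank feature importance to reduce to a compact final model) can be sketched as follows. This is a minimal illustration on synthetic data, not the study's code: scikit-learn's GradientBoostingClassifier stands in for CatBoost, permutation importance stands in for SHAP values, and all dataset dimensions are assumptions loosely mirroring the abstract.

```python
# Hedged sketch of the abstract's pipeline. Stand-ins (assumptions):
# - GradientBoostingClassifier replaces CatBoost
# - permutation importance replaces SHAP value ranking
# - synthetic data replaces the MIMIC-IV cohort
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 2414 patients x 29 Boruta-selected features
X, y = make_classification(n_samples=2414, n_features=29,
                           n_informative=8, random_state=0)
# Roughly 70/30 training vs internal validation split, as in the study
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Discrimination on the held-out set (the abstract's primary metric)
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

# Rank features and keep the top 8, mirroring the reduced final model
imp = permutation_importance(model, X_val, y_val,
                             n_repeats=5, random_state=0)
top8 = imp.importances_mean.argsort()[::-1][:8]
print(f"validation AUC = {auc:.3f}, top-8 feature indices = {sorted(top8)}")
```

In the study itself, the importance ranking would come from SHAP values (e.g., a tree explainer over the CatBoost model), which additionally yield per-patient explanations rather than only a global ranking.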
Results:
Overall, 2414 patients with ICH were enrolled from the MIMIC-IV database, with 1689 and 725 patients assigned to the training and internal validation sets, respectively. An external validation set of 2806 patients with ICH from the eICU database was used. Among the nine ML models tested, the CatBoost model demonstrated the best discriminative ability. After reducing features based on their importance, an explainable final CatBoost model was developed using eight features. The final model accurately predicted sepsis in internal (AUC = 0.812) and external (AUC = 0.771) validation.
Conclusions:
Our explainable ML model was successfully developed to accurately predict sepsis while addressing the “black-box” challenge through model interpretation. This model may improve outcomes for patients with ICH by providing early alerts and actionable feedback, potentially preventing or mitigating sepsis through early measures such as medication adjustment.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.