JMIR Preprints #80969: Machine learning algorithms to predict venous thromboembolism in patients with sepsis in the intensive care unit: A multicenter retrospective study

Machine learning algorithms to predict venous thromboembolism in patients with sepsis in the intensive care unit: A multicenter retrospective study

Yan Zhang;
Xia Ren;
Luojie Liu;
Junjie Zha;
Yijie Gu;
Hongwei Ye

ABSTRACT

Background:

Venous thromboembolism (VTE) is a common and severe complication in intensive care unit (ICU) patients with sepsis. Conventional risk stratification tools lack sepsis-specific features and may inadequately capture complex, nonlinear interactions among clinical variables.

Objective:

This study aimed to develop and validate an interpretable machine learning (ML) model for the early prediction of VTE in septic ICU patients.

Methods:

This multicenter retrospective study used data from the Medical Information Mart for Intensive Care (MIMIC-IV) database for model development and internal validation, and an independent cohort from Changshu Hospital for external validation. Candidate predictors were selected through univariate analysis, followed by least absolute shrinkage and selection operator (LASSO) regression. Variables retained by LASSO were entered into multivariable logistic regression to identify independent predictors, which were then used to develop nine ML models: categorical boosting (CatBoost), decision tree (DT), k-nearest neighbor (KNN), light gradient boosting machine (LGBM), logistic regression (LR), multilayer perceptron (MLP), naive Bayes (NB), random forest (RF), and support vector machine (SVM). Model performance was evaluated by discrimination (area under the receiver operating characteristic curve, AUC), calibration, and clinical utility (decision curve analysis, DCA). Model interpretability was assessed using SHapley Additive exPlanations (SHAP) to quantify the contribution of individual features to the predicted risk.
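
As an illustration of two of the evaluation metrics named above, the following is a minimal sketch in plain Python of a rank-based AUC and of the net benefit used in decision curve analysis. It is not the authors' code: the function names `auc` and `net_benefit`, and the toy labels and predicted risks, are assumptions made up for this example and have no connection to the study data.

```python
def auc(y_true, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at a threshold probability pt, as used in decision
    curve analysis: TP/n - (FP/n) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = VTE, 0 = no VTE.
toy_labels = [0, 0, 1, 0, 1, 0, 0, 1]
toy_risks = [0.05, 0.10, 0.80, 0.30, 0.60, 0.20, 0.70, 0.40]

toy_auc = auc(toy_labels, toy_risks)
model_nb = net_benefit(toy_labels, toy_risks, threshold=0.5)
# "Treat all" reference strategy: every patient is classified as high risk.
treat_all_nb = net_benefit(toy_labels, [1.0] * len(toy_labels), threshold=0.5)
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares the model against the "treat all" and "treat none" (net benefit of zero) strategies.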

Results:

A total of 25,197 patients from the MIMIC-IV cohort and 328 patients from the external cohort were included, with VTE incidences of 3.35% and 9.15%, respectively. The LGBM model demonstrated the best performance, achieving an AUC of 0.956 in internal validation and 0.786 in external validation. Calibration curves indicated strong agreement between predicted and observed outcomes, and DCA showed superior net benefit across clinically relevant thresholds. SHAP analysis identified central venous catheterization, serum chloride and bicarbonate levels, arterial catheterization, and prolonged partial thromboplastin time (PTT) as the most influential predictors. Partial dependence plots revealed both linear and nonlinear associations between these variables and VTE risk. Individual-level force plots further enhanced interpretability by visualizing personalized risk profiles.
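
To make the idea behind the SHAP attributions concrete, the sketch below computes exact Shapley values for a tiny, invented risk function by enumerating every feature coalition. Everything in it is an assumption for illustration: the function `toy_risk`, its coefficients, and the generic feature indices are hypothetical and unrelated to the fitted LGBM model, and SHAP implementations for tree ensembles typically rely on faster model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values: for each feature, the weighted average of its
    marginal contribution over all coalitions of the other features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                phi[i] += weight * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi


def toy_risk(x):
    """Hypothetical risk function with one interaction term (invented)."""
    return 0.02 + 0.10 * x[0] + 0.03 * x[1] + 0.04 * x[0] * x[2]


# Hypothetical patient and reference ("baseline") feature vectors.
patient = (1.0, 1.0, 1.0)
baseline = (0.0, 0.0, 0.0)


def coalition_value(coalition):
    """Model output when features in the coalition take the patient's
    values and all remaining features take the baseline values."""
    mixed = [patient[j] if j in coalition else baseline[j]
             for j in range(len(patient))]
    return toy_risk(mixed)


phi = shapley_values(coalition_value, n_features=3)
# Additivity: the attributions sum to the gap between the patient's
# predicted risk and the baseline prediction.
gap = toy_risk(patient) - toy_risk(baseline)
```

The additivity property is what a force plot visualizes: starting from the baseline prediction, each feature's attribution pushes the individual prediction up or down until the patient's predicted risk is reached.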

Conclusions:

We developed a high-performing and interpretable ML model for predicting VTE in ICU patients with sepsis. By integrating diverse clinical data and leveraging SHAP for transparent explanations, this tool may support personalized prophylaxis and early diagnostic strategies to reduce VTE-related morbidity and mortality in septic ICU populations.

Citation

Please cite as:

Zhang Y, Ren X, Liu L, Zha J, Gu Y, Ye H

Machine Learning Algorithms to Predict Venous Thromboembolism in Patients With Sepsis in the Intensive Care Unit: Multicenter Retrospective Study

JMIR Med Inform 2026;14:e80969

DOI: 10.2196/80969

PMID: 41617215

PMCID: 12905564

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 20, 2025

Open Peer Review Period: Jul 20, 2025 - Sep 14, 2025

Date Accepted: Dec 29, 2025
