JMIR Preprints #76542: Evaluating biomedical feature fusion on machine learning’s predictability and interpretability of COVID-19 severity types: Model Development, Interpretation and Validation

Evaluating biomedical feature fusion on machine learning’s predictability and interpretability of COVID-19 severity types: Model Development, Interpretation and Validation

Haleigh Noelle West-Page;
Kevin McGoff;
Harrison Latimer;
Isaac Olufadewa;
Shi Chen

ABSTRACT

Background:

Accurately differentiating severe from non-severe COVID-19 clinical types is critical for health care systems to optimize their workflows. Current techniques cannot accurately predict a COVID-19 patient’s clinical type, especially as SARS-CoV-2 continues to mutate.

Objective:

We explore the predictability and interpretability of multiple state-of-the-art machine learning (ML) techniques trained and tested on different biomedical data types and COVID-19 variants.

Methods:

Comprehensive patient-level data were collected from 362 patients (214 severe, 148 non-severe) with the original SARS-CoV-2 variant in 2020 and 1000 patients (500 severe, 500 non-severe) with the Omicron variant in 2022-2023. The data included 26 biochemical features from blood testing and 26 clinical features from patients’ clinical characteristics and medical history. Different ML techniques, including penalized logistic regression (LR), random forest (RF), k-nearest neighbors (kNN), and support vector machines (SVM), were applied to build predictive models on each data modality separately and on both modalities together, for each variant. Fifty randomized train-test splits were conducted per scenario, and performance results were recorded.
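The evaluation protocol described above (repeated randomized train-test splits, one model per data modality, mean AUC as the summary) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the cohort is synthetic, the feature counts, split fraction, and k are assumed values, the helper names (make_synthetic_cohort, knn_scores, auc, mean_auc) are hypothetical, and a plain k-nearest-neighbors scorer stands in for the full set of models.

```python
import random
import statistics

def make_synthetic_cohort(n=120, n_bio=4, n_clin=4, seed=0):
    """Hypothetical stand-in data: severe cases (label 1) have shifted feature means."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        y = i % 2  # balanced classes, an assumption made for simplicity
        rows.append([rng.gauss(0.8 * y, 1.0) for _ in range(n_bio + n_clin)])
        labels.append(y)
    return rows, labels

def knn_scores(train_x, train_y, test_x, k=7):
    """Score each test row by the fraction of severe cases among its k nearest training rows."""
    scores = []
    for row in test_x:
        # Squared Euclidean distance; the synthetic features already share one scale.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, t)), y)
            for t, y in zip(train_x, train_y)
        )
        scores.append(sum(y for _, y in dists[:k]) / k)
    return scores

def auc(scores, labels):
    """AUC as the share of (severe, non-severe) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((1.0 if p > n else (0.5 if p == n else 0.0)) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(rows, labels, columns, n_splits=50, test_frac=0.25, seed=1):
    """Average test AUC over repeated randomized train-test splits for one feature subset."""
    rng = random.Random(seed)
    x = [[r[c] for c in columns] for r in rows]
    idx = list(range(len(x)))
    n_test = int(len(x) * test_frac)
    aucs = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        scores = knn_scores(
            [x[i] for i in train], [labels[i] for i in train], [x[i] for i in test]
        )
        aucs.append(auc(scores, [labels[i] for i in test]))
    return statistics.mean(aucs)

rows, labels = make_synthetic_cohort()
# Column indices per modality (assumed layout: biochemical first, then clinical).
modalities = {
    "biochemical": [0, 1, 2, 3],
    "clinical": [4, 5, 6, 7],
    "fused": list(range(8)),
}
results = {name: mean_auc(rows, labels, cols) for name, cols in modalities.items()}
for name, value in results.items():
    print(f"{name}: mean AUC = {value:.3f}")
```

With real data, library routines such as scikit-learn's train_test_split and roc_auc_score would likely replace the hand-written helpers, and features on different scales would need standardizing before any distance-based model such as kNN is fit.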

Results:

The fused (hybrid) characteristic modality yielded the highest mean area under the curve (AUC), 0.915, while the biochemical modality alone and the clinical modality alone had AUCs of 0.862 and 0.818, respectively. All ML models performed similarly under different testing scenarios and were robust when cross-tested with original and Omicron variant patient data. Our models ranked elevated D-dimer (biochemical), elevated high-sensitivity troponin I (biochemical), and age greater than 55 years (clinical) as the most predictive features of severe COVID-19.

Conclusions:

ML is a powerful tool for predicting severe COVID-19 based on comprehensive individual patient-level data. Further, ML models trained on the biochemical and clinical modalities together show enhanced predictive power. The improved performance of these ML models when trained and cross-tested with Omicron variant data supports the robustness of ML as a tool for clinical decision support.

Citation

Please cite as:

West-Page HN, McGoff K, Latimer H, Olufadewa I, Chen S

Evaluating Biomedical Feature Fusion on Machine Learning’s Predictability and Interpretability of COVID-19 Severity Types: Model Development, Interpretation, and Validation

JMIR Form Res 2026;10:e76542

DOI: 10.2196/76542

PMID: 42060527

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Formative Research

Date Submitted: Apr 28, 2025

Open Peer Review Period: Apr 25, 2025 - Jun 20, 2025

Date Accepted: Apr 1, 2026
