Accepted for/Published in: JMIR Formative Research

Date Submitted: Aug 20, 2024
Open Peer Review Period: Sep 6, 2024 - Nov 1, 2024
Date Accepted: May 28, 2025

The final, peer-reviewed published version of this preprint can be found here:

Explainable Machine Learning Framework for Dynamic Monitoring of Disease Prognostic Risk: Retrospective Cohort Study

Ishikawa T, Shinoda M, Oya M, Ashizaki K, Ota S, Kamachi K, Sakurada K, Kawakami E, Shinkai M

JMIR Form Res 2025;9:e65585

DOI: 10.2196/65585

PMID: 41084807

PMCID: 12501906

Explainable machine learning framework for dynamic monitoring of disease prognostic risk

  • Tetsuo Ishikawa
  • Masahiro Shinoda
  • Megumi Oya
  • Koichi Ashizaki
  • Shinichiro Ota
  • Kenichi Kamachi
  • Kazuhiro Sakurada
  • Eiryo Kawakami
  • Masaharu Shinkai

ABSTRACT

Background:

Patients' conditions continue to change after diagnosis, and each patient follows a unique time course of progression. This variability complicates the prediction of clinical outcomes, particularly for diseases with acute exacerbations such as coronavirus disease 2019 (COVID-19).

Objective:

This study aimed to perform risk classification at initial diagnosis to predict the maximum severity reached by patients with COVID-19, and to implement dynamic risk monitoring during hospitalization that continuously assesses mortality risk using longitudinal data.

Methods:

This retrospective cohort study included 382 patients with COVID-19 treated at Tokyo Shinagawa Hospital between January and September 2020. Patients under 18 years old or with unknown clinical outcomes were excluded. The dataset was divided into training and validation sets (2:1 ratio) using stratified sampling. Risk classification at initial diagnosis used 84 variables, including symptoms, background factors, and blood/urine biomarkers. Gradient Boosting Decision Trees (GBDT) were used to predict, from data available at diagnosis, the highest severity level a patient would reach, and risk factors were interpreted using SHAP (SHapley Additive exPlanations). For dynamic risk assessment during hospitalization, longitudinal data from 182 inpatients were used, comprising 72 variables such as blood/urine biomarkers, vital signs, and background data. Random Survival Forests were applied to predict daily mortality risk, with the 7-day cumulative hazard function as the measure. SurvSHAP(t) was applied to provide a time-dependent explanation of the contribution of each variable to the prediction.
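To make the risk measure concrete, the following is a minimal sketch of a 7-day cumulative hazard computed with the Nelson-Aalen estimator, the estimator conventionally used inside the terminal nodes of a Random Survival Forest. It is not the authors' model: it is a single, unconditional estimate on hypothetical toy data, and it assumes that "7-day cumulative hazard" means the hazard accumulated up to a 7-day horizon. The function names and the data are illustrative only.

```python
from collections import Counter


def nelson_aalen(durations, events):
    """Return (time, cumulative hazard) pairs at each distinct event time.

    durations: follow-up time in days for each patient
    events:    1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    curve = []
    cumulative = 0.0
    for t in sorted(deaths):
        # Patients still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        cumulative += deaths[t] / at_risk
        curve.append((t, cumulative))
    return curve


def cumulative_hazard_at(curve, horizon):
    """Evaluate the step function at `horizon` (0.0 before the first event)."""
    value = 0.0
    for t, h in curve:
        if t > horizon:
            break
        value = h
    return value


# Hypothetical toy cohort (not data from the study)
durations = [2, 3, 3, 5, 7, 8, 10, 12]
events = [1, 0, 1, 1, 0, 1, 0, 0]

curve = nelson_aalen(durations, events)
h7 = cumulative_hazard_at(curve, 7)
print(f"7-day cumulative hazard: {h7:.4f}")
```

In a Random Survival Forest, an estimate of this kind is computed within each terminal node and averaged across trees, so the resulting cumulative hazard depends on the patient's covariates rather than being a single cohort-wide curve.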

Results:

The cohort had a median age of 39 years, with 233 males and 149 females. Of the 51 inpatients requiring oxygen, 30 required low-flow oxygen, 3 required high-flow oxygen, and 8 were treated with invasive ventilation or extracorporeal membrane oxygenation. Ten patients died. The GBDT model predicted COVID-19 severity with areas under the receiver operating characteristic curve ranging from 0.717 to 0.970 across severity thresholds. Pneumonia was the most significant predictor for moderate cases, while age and biomarkers such as lymphocyte count, creatinine, C-reactive protein (CRP), and prothrombin time were key for severe outcomes. The dynamic mortality risk assessment during hospitalization could discriminate between deceased and surviving patients 1–2 weeks before the outcome. Early in hospitalization, CRP was an important risk factor for mortality; in the middle period, peripheral oxygen saturation grew in importance; and immediately before death, platelets and β-D-glucan were the main risk factors.
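As an illustration of what "area under the receiver operating characteristic curve across severity thresholds" means, the sketch below binarizes an ordinal severity label at each threshold and computes the AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. This is a simplification under stated assumptions: the severity coding (0 to 3), the single risk score reused for every threshold, and all data values are hypothetical and are not taken from the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U statistic)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def auroc_by_threshold(scores, severity, thresholds):
    """Binarize ordinal severity at each threshold k (severity >= k) and score it."""
    return {
        k: auroc(scores, [1 if s >= k else 0 for s in severity])
        for k in thresholds
    }


# Hypothetical ordinal severity levels and model risk scores
severity = [0, 0, 1, 1, 2, 2, 3, 3]
scores = [0.1, 0.3, 0.2, 0.5, 0.6, 0.4, 0.9, 0.8]

result = auroc_by_threshold(scores, severity, [1, 2, 3])
for k, value in result.items():
    print(f"severity >= {k}: AUROC = {value:.3f}")
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based implementation would be preferable for larger datasets.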

Conclusions:

The machine learning-based framework developed in this study demonstrated the ability to predict the maximum severity of COVID-19 at initial diagnosis with reasonable accuracy. However, the risk of deterioration for some patients evolved during hospitalization, highlighting the importance of continuous monitoring. The dynamic risk monitoring framework, which updates mortality risk predictions daily using longitudinal data, achieved high predictive performance and offered explainable predictions through SurvSHAP(t). This approach can provide healthcare professionals with real-time guidance for clinical decision-making, enabling timely interventions and better resource allocation.

Clinical Trial:

This study was approved by the local institutional review board of RIKEN and Tokyo Shinagawa Hospital (approval number: 20-A-06).



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.