Lam C, Tso CF, Green-Saxena A, Pellegrini E, Iqbal Z, Evans D, Hoffman J, Calvert J, Mao Q, Das R
Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome From Time-Series Clinical Data: Model Development and Validation Study
Semi-supervised deep learning from time series clinical data for acute respiratory distress syndrome prediction: model development and validation study
Carson Lam;
Chak Foon Tso;
Abigail Green-Saxena;
Emily Pellegrini;
Zohora Iqbal;
Daniel Evans;
Jana Hoffman;
Jacob Calvert;
Qingqing Mao;
Ritankar Das
ABSTRACT
Background:
A large number of patients hospitalized with COVID-19 develop acute respiratory distress syndrome (ARDS).
Objective:
In response to the need for clinical decision support tools to help manage the pandemic, we developed machine learning algorithms to predict ARDS in general and COVID-19 populations.
Methods:
Semi-supervised machine learning (SSL) techniques were applied to 29,127 encounters from patients admitted to seven United States hospitals from 5/1/2019 to 5/1/2021. A recurrent neural network (RNN) using a time series of electronic health record (EHR) data was applied at the time peripheral oxygen saturation (SpO2) fell below the normal range (< 97%) to predict subsequent development of ARDS during the remainder of the hospital stay. Model performance was assessed by area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) on an external holdout test set.
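Two mechanical steps in the Methods can be illustrated with a short sketch: finding the moment a patient's SpO2 first drops below 97% (the point at which the model is applied), and scoring predictions with AUROC and AUPRC. This is a minimal illustration under stated assumptions, not the authors' pipeline. The function names, the use of `None` for missing readings, and the average-precision estimate of AUPRC are all choices made here for illustration; the abstract does not describe how the study implemented either step.

```python
from typing import List, Optional, Sequence

SPO2_THRESHOLD = 97.0  # percent; the abstract's "below the normal range (< 97%)"


def first_trigger_index(spo2: Sequence[Optional[float]],
                        threshold: float = SPO2_THRESHOLD) -> Optional[int]:
    """Index of the first SpO2 reading strictly below the threshold.

    Missing readings (None) are skipped (an assumption of this sketch).
    Returns None if the series never crosses the threshold.
    """
    for i, value in enumerate(spo2):
        if value is not None and value < threshold:
            return i
    return None


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a common estimate of AUPRC: the mean of the
    precision values at the rank of each positive, in descending score order.
    Tied scores are ordered arbitrarily in this sketch."""
    order: List[int] = sorted(range(len(scores)),
                              key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Hypothetical hourly SpO2 readings for one encounter (not study data).
hourly_spo2 = [99.0, 98.0, None, 97.0, 96.0, 95.0]
print(first_trigger_index(hourly_spo2))  # 4: 97.0 is not strictly below 97

# Hypothetical labels and model scores (not study data).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_score))  # 0.75
print(round(average_precision(y_true, y_score), 4))  # 0.8333
```

The pairwise AUROC loop is quadratic in the number of patients, which is fine for illustration; a rank-based implementation would be the usual choice at the scale of the dataset described here.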
Results:
Across the whole dataset, the median time between the first SpO2 measurement < 97% and subsequent respiratory failure was 21 hours. The AUROC for predicting subsequent ARDS was 0.73 when training on a labeled dataset of 6,930 patients; 0.78 when training on that labeled dataset augmented with an unlabeled dataset of 16,173 patients using SSL techniques; and 0.84 when training on the entire training set of 23,103 labeled patients.
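The abstract does not say which SSL technique was used to combine the labeled and unlabeled patients. As one generic possibility, the sketch below shows self-training (pseudo-labeling): fit on the labeled data, label the unlabeled examples the model is confident about, add them to the training set, and refit. Everything here is an assumption made for illustration: the one-dimensional feature, the nearest-centroid classifier standing in for the RNN, the margin-based confidence rule, and all function names.

```python
from typing import Dict, List, Sequence, Tuple


def fit_centroids(xs: Sequence[float], ys: Sequence[int]) -> Dict[int, float]:
    """Mean feature value per class. Both classes (0 and 1) must be present."""
    means = {}
    for c in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == c]
        means[c] = sum(vals) / len(vals)
    return means


def predict_with_margin(means: Dict[int, float], x: float) -> Tuple[int, float]:
    """Predict the class of the nearer centroid; the margin is the difference
    between the two distances and serves as a crude confidence score."""
    d0 = abs(x - means[0])
    d1 = abs(x - means[1])
    label = 1 if d1 < d0 else 0
    return label, abs(d0 - d1)


def self_train(labeled_x: Sequence[float], labeled_y: Sequence[int],
               unlabeled_x: Sequence[float], min_margin: float,
               rounds: int = 3) -> Tuple[Dict[int, float], List[float]]:
    """Generic self-training loop. Returns the final centroids and the
    unlabeled examples that never received a confident pseudo-label."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        means = fit_centroids(xs, ys)
        remaining = []
        for x in pool:
            label, margin = predict_with_margin(means, x)
            if margin >= min_margin:
                # Confident prediction: adopt it as a pseudo-label.
                xs.append(x)
                ys.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # nothing new was pseudo-labeled; stop early
        pool = remaining
    return fit_centroids(xs, ys), pool


# Toy data (not study data): two labeled points and three unlabeled ones.
centroids, leftover = self_train(
    labeled_x=[0.0, 1.0], labeled_y=[0, 1],
    unlabeled_x=[0.1, 0.9, 0.5], min_margin=0.5,
)
print(leftover)  # [0.5]: the ambiguous point is never pseudo-labeled
```

The confidence threshold is the main design choice in a loop like this: set too low, wrong pseudo-labels reinforce themselves; set too high, the unlabeled data contributes little.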
Conclusions:
In the setting of time-series inpatient data, with careful model training design, unlabeled data can be used to improve the performance of machine learning models when labeled data are scarce or expensive.
Clinical Trial:
N/A