Accepted for/Published in: JMIR Formative Research

Date Submitted: Feb 17, 2021
Date Accepted: Aug 1, 2021
Date Submitted to PubMed: Aug 16, 2021

The final, peer-reviewed published version of this preprint can be found here:

Lam C, Tso CF, Green-Saxena A, Pellegrini E, Iqbal Z, Evans D, Hoffman J, Calvert J, Mao Q, Das R

Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome From Time-Series Clinical Data: Model Development and Validation Study

JMIR Form Res 2021;5(9):e28028

DOI: 10.2196/28028

PMID: 34398784

PMCID: 8447921

Semi-supervised deep learning from time series clinical data for acute respiratory distress syndrome prediction: model development and validation study

  • Carson Lam
  • Chak Foon Tso
  • Abigail Green-Saxena
  • Emily Pellegrini
  • Zohora Iqbal
  • Daniel Evans
  • Jana Hoffman
  • Jacob Calvert
  • Qingqing Mao
  • Ritankar Das

ABSTRACT

Background:

Many patients hospitalized with COVID-19 also develop acute respiratory distress syndrome (ARDS).

Objective:

In response to the need for clinical decision support tools to help manage the pandemic, we developed machine learning algorithms to predict ARDS in general and COVID-19 populations.

Methods:

Semi-supervised machine learning (SSL) techniques were applied to 29,127 encounters from patients admitted to seven United States hospitals from 5/1/2019 to 5/1/2021. A recurrent neural network (RNN) using a time series of electronic health record (EHR) data was applied at the time peripheral oxygen saturation (SpO2) fell below the normal range (< 97%) to predict subsequent development of ARDS during the remainder of the hospital stay. Model performance was assessed by the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) on an external hold-out test set.
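As an illustration of the quantities named in the Methods, the following minimal Python sketch locates the first SpO2 reading below 97% in a time series (the point at which a prediction would be triggered) and computes AUROC and a step-wise AUPRC estimate (average precision) from labels and scores. The function names, the toy readings, and the choice of average precision as the AUPRC estimator are assumptions made for this example; they are not taken from the study's code, and the RNN itself is not reproduced here.

```python
from typing import Optional, Sequence

# Threshold from the Methods text: SpO2 "below the normal range (< 97%)".
SPO2_THRESHOLD = 97.0


def first_desaturation_index(spo2: Sequence[float],
                             threshold: float = SPO2_THRESHOLD) -> Optional[int]:
    """Index of the first reading strictly below the threshold, or None."""
    for i, value in enumerate(spo2):
        if value < threshold:
            return i
    return None


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes no tied scores; ties would need an explicit tie-breaking rule.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank encounters from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank  # precision at this recall step
    return ap / total_pos


# Hypothetical toy data, not study data.
readings = [99.0, 98.0, 96.5, 95.0]
trigger = first_desaturation_index(readings)
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

The pairwise AUROC loop is quadratic in the number of encounters, which is acceptable for a sketch; a rank-based formulation would be preferable at the scale of the dataset described above.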

Results:

In the whole dataset, the median time between the first SpO2 measurement < 97% and subsequent respiratory failure was 21 hours. The AUC for predicting subsequent ARDS was 0.73 when training on a labeled dataset of 6,930 patients; 0.78 when training on the labeled dataset augmented with the unlabeled dataset of 16,173 patients using SSL techniques; and 0.84 when training on the entire training set of 23,103 labeled patients.
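The abstract does not say which SSL technique was used to fold the unlabeled patients into training (the 6,930 labeled and 16,173 unlabeled patients together make up the 23,103-patient training set). To make the general idea concrete, the sketch below shows one common SSL approach, self-training with pseudo-labels, using a nearest-centroid classifier on synthetic 2-D points. The technique, the classifier, every function name, and the `min_margin` parameter are assumptions chosen for illustration and should not be read as the authors' method.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Model = Tuple[List[float], List[float]]  # (negative centroid, positive centroid)


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(X: List[Point], y: List[int]) -> Model:
    """Nearest-centroid 'training': one centroid per class."""
    neg = [x for x, label in zip(X, y) if label == 0]
    pos = [x for x, label in zip(X, y) if label == 1]
    return centroid(neg), centroid(pos)


def predict_with_margin(model: Model, x: Point) -> Tuple[int, float]:
    """Predicted label plus a crude confidence (gap between the two distances)."""
    c0, c1 = model
    d0, d1 = sq_dist(x, c0), sq_dist(x, c1)
    return (1 if d1 < d0 else 0), abs(d0 - d1)


def self_train(X_lab: List[Point], y_lab: List[int], X_unlab: List[Point],
               min_margin: float = 5.0, rounds: int = 3) -> Tuple[Model, int]:
    """Iteratively add confidently pseudo-labeled points, then refit."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit_centroids(X, y)
    for _ in range(rounds):
        remaining, added = [], 0
        for x in pool:
            label, margin = predict_with_margin(model, x)
            if margin >= min_margin:   # confident: accept the pseudo-label
                X.append(x)
                y.append(label)
                added += 1
            else:                      # ambiguous: keep unlabeled
                remaining.append(x)
        pool = remaining
        if added == 0:
            break
        model = fit_centroids(X, y)
    return model, len(X) - len(X_lab)


# Synthetic toy data, not study data.
X_lab = [(0.0, 0.0), (4.0, 4.0)]
y_lab = [0, 1]
X_unlab = [(0.5, 0.2), (3.6, 4.1), (1.0, 0.0), (2.0, 2.0)]
model, n_pseudo = self_train(X_lab, y_lab, X_unlab)
pred_high, _ = predict_with_margin(model, (3.0, 3.5))
pred_low, _ = predict_with_margin(model, (0.2, 0.1))
```

In this toy run the point midway between the two classes is never pseudo-labeled because its margin stays below `min_margin`. That confidence gate is one example of the kind of careful training design that keeps wrong pseudo-labels from being reinforced.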

Conclusions:

In the setting of time series inpatient data, with careful model training design, unlabeled data can be used to improve the performance of machine learning models when labeled data are scarce or expensive.

Clinical Trial:

N/A



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.