Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Mar 24, 2022
Date Accepted: Apr 26, 2022
Predict Postoperative Mortality with Deep Neural Networks and Natural Language Processing: Models Development and Validation
ABSTRACT
Background:
Machine learning achieves better predictions of postoperative mortality than previous prediction tools. Free text descriptions of the preoperative diagnosis and the planned procedure are available preoperatively. Because reading these descriptions helps anesthesiologists evaluate the risk of the surgery, we hypothesized that deep learning models with unstructured text could improve postoperative mortality prediction. However, it is challenging to extract meaningful concept embeddings from this unstructured clinical text.
Objective:
This study aimed to develop a fusion deep learning model that combines structured and unstructured features to predict in-hospital 30-day postoperative mortality before surgery. We also assessed machine learning models that predict postoperative mortality from preoperative data with and without free clinical text.
Methods:
We retrospectively collected preoperative anesthesia assessments, surgical information, and discharge summaries of patients undergoing general or neuraxial anesthesia from electronic medical records from 2016 to 2020. We first compared the deep neural network (DNN) with other models using the same input features to demonstrate its effectiveness. Then, we combined the DNN model with Bidirectional Encoder Representations from Transformers (BERT) to extract information from clinical texts. The effect of adding text information on model performance was evaluated using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Statistical significance was defined as P < .05.
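As an illustration of the two evaluation metrics named above, the following is a minimal Python sketch that computes AUROC and AUPRC and attaches a 95% confidence interval to each. It rests on assumptions not stated in the abstract: AUPRC is estimated as average precision, the interval comes from a percentile bootstrap, and the function names (`auroc`, `auprc`, `bootstrap_ci`) and the toy labels and scores are hypothetical. It is not the authors' evaluation code.

```python
import random


def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def auprc(y_true, y_score):
    """AUPRC estimated as average precision (assumption; ties follow sort order)."""
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    n_pos = sum(y_true)
    true_pos = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this positive's rank
    return total / n_pos


def bootstrap_ci(metric, y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if 0 < sum(labels) < n:  # skip resamples that contain only one class
            stats.append(metric(labels, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only (1 = died within 30 days, 0 = survived).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print("AUROC:", auroc(toy_labels, toy_scores))
print("AUPRC:", auprc(toy_labels, toy_scores))
print("AUROC 95% CI:", bootstrap_ci(auroc, toy_labels, toy_scores, n_boot=200))
```

With a rare outcome such as postoperative mortality, AUPRC is sensitive to how well the few positive cases are ranked, which is one reason to report it alongside AUROC.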
Results:
The final cohort contained 121,313 surgeries. A total of 1,562 (1.3%) patients died within 30 days of surgery. Our BERT-DNN model achieved the highest AUROC (0.964 [95% confidence interval {CI}: 0.961-0.967]) and AUPRC (0.336 [95% CI: 0.276-0.402]). The AUROC of BERT-DNN was significantly higher than that of logistic regression (AUROC 0.952 [95% CI: 0.949-0.955]) and the American Society of Anesthesiologists physical status (ASAPS) (AUROC 0.892 [95% CI: 0.887-0.896]), but not significantly higher than that of the DNN (AUROC 0.959 [95% CI: 0.956-0.962]) or the random forest (AUROC 0.961 [95% CI: 0.958-0.964]). The AUPRC of BERT-DNN was significantly higher than that of the DNN (AUPRC 0.319 [95% CI: 0.260-0.384]), the random forest (AUPRC 0.296 [95% CI: 0.239-0.360]), logistic regression (AUPRC 0.276 [95% CI: 0.220-0.339]), and the ASAPS (AUPRC 0.149 [95% CI: 0.107-0.203]).
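To put the AUPRC values in context, the short sketch below compares them with the event prevalence, since a classifier with no discriminative ability has an expected AUPRC roughly equal to the prevalence. It assumes the 1,562 deaths are counted against the 121,313 surgeries, which is consistent with the reported 1.3%; the variable names are hypothetical and the ratio is an illustrative calculation, not a result reported by the authors.

```python
# Figures taken from the Results paragraph above.
deaths = 1562
surgeries = 121313

# Assumption: deaths and surgeries share the same denominator.
prevalence = deaths / surgeries

# Reported AUPRC values for each model.
reported_auprc = {
    "BERT-DNN": 0.336,
    "DNN": 0.319,
    "Random forest": 0.296,
    "Logistic regression": 0.276,
    "ASAPS": 0.149,
}

print(f"Prevalence (approximate no-skill AUPRC): {prevalence:.4f}")
for model, value in reported_auprc.items():
    # Ratio of the reported AUPRC to the no-skill baseline.
    print(f"{model}: AUPRC {value:.3f}, {value / prevalence:.1f}x baseline")
```

Read this way, an AUPRC that looks low in absolute terms can still represent a large improvement over chance when the outcome is this rare.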
Conclusions:
Our BERT-DNN model had an AUPRC significantly higher than those of previously proposed models that use no text, and an AUROC significantly higher than those of logistic regression and the ASAPS. This technique helps identify higher-risk patients from the surgical description text in electronic medical records.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.