Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 11, 2020
Date Accepted: Apr 19, 2020
Artificial intelligence-based Multimodal Risk Assessment Model for Surgical site infection (AMRAMS): a development and validation study
ABSTRACT
Background:
Surgical site infection (SSI) is one of the most frequent types of healthcare-associated infections. It increases mortality, prolongs hospital length of stay, and raises healthcare costs. Many institutions have developed risk assessment models for SSI to help surgeons preoperatively identify high-risk patients and guide clinical intervention, but most of these models have low accuracy.
Objective:
We aimed to develop the Artificial intelligence-based Multimodal Risk Assessment Model for Surgical site infection (AMRAMS) for inpatients undergoing operations, using routinely collected clinical data. We internally and externally validated the discrimination of models combining various machine learning and natural language processing techniques and compared them with the National Nosocomial Infections Surveillance (NNIS) risk index.
Methods:
We retrieved inpatient records between January 1, 2014, and June 30, 2019, from the electronic medical record (EMR) system of Rui Jin Hospital, Luwan Branch, Shanghai, China. We used data recorded before July 1, 2018, as the development set for internal validation and the rest as the test set for external validation. We included patient demographics, preoperative laboratory results, and free-text preoperative notes as features. We used word embedding techniques to encode the text information, and we trained LASSO, random forest, gradient boosting decision tree (GBDT), convolutional neural network (CNN), and self-attention network models on the combined data. Surgeons manually scored the NNIS risk index.
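The multimodal feature construction described above can be sketched as follows. This is an illustrative assumption, not the study's actual code: the toy embedding table, feature values, and notes are simulated, and GBDT stands in for the full set of models. Free-text notes are encoded as averaged word embeddings and concatenated with structured features before training.

```python
# Hypothetical sketch of a multimodal SSI-risk pipeline: structured
# features (e.g., demographics, lab results) are concatenated with
# averaged word embeddings of preoperative notes, then a gradient
# boosting classifier is trained. All data here are simulated.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy word-embedding table (in practice, embeddings would be learned
# from the corpus of clinical notes)
vocab = {"fever": 0, "pain": 1, "clean": 2, "wound": 3}
emb = rng.normal(size=(len(vocab), 8))  # 8-dim embeddings for illustration

def embed_note(note: str) -> np.ndarray:
    """Average the embeddings of known words in a free-text note."""
    idx = [vocab[w] for w in note.lower().split() if w in vocab]
    return emb[idx].mean(axis=0) if idx else np.zeros(emb.shape[1])

# Simulated structured features (e.g., age plus two lab values) and notes
structured = rng.normal(size=(100, 3))
notes = ["fever wound pain" if i % 2 else "clean wound" for i in range(100)]
labels = np.array([i % 2 for i in range(100)])  # 1 = SSI, 0 = no SSI

# Concatenate structured features with note embeddings, then train
X = np.hstack([structured, np.vstack([embed_note(n) for n in notes])])
model = GradientBoostingClassifier().fit(X, labels)
print(model.predict_proba(X[:1]).shape)  # one (P(no SSI), P(SSI)) pair
```

The same combined feature matrix could feed any of the conventional models named above; the deep models (CNN, self-attention) would instead consume the token embedding sequence directly.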
Results:
For internal bootstrapping validation, CNN yielded the highest mean (95% CI) AUROC of 0.889 (0.886-0.892), and the paired-sample t-test showed statistically significant advantages compared with the other models (P value < .0001). The self-attention network yielded the second-highest mean AUROC of 0.882 (0.878-0.886), but this was only numerically higher than the AUROC of the third-best model, GBDT with text embeddings (mean AUROC of 0.881, 95% CI of 0.878-0.884, P value = .467). The AUROCs of the LASSO, random forest, and GBDT models using text embeddings were statistically significantly higher than those of the corresponding models without them (P value < .0001). For external validation, the self-attention network yielded the highest AUROC of 0.879. CNN was the second-best model (AUROC of 0.878), and GBDT with text embeddings was the third (AUROC of 0.872). The NNIS risk index scored by surgeons had an AUROC of 0.651.
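The internal validation procedure above can be sketched as a paired bootstrap: each resample is scored by both models, so the resulting per-resample AUROCs can be compared with a paired-sample t-test. This is an illustrative assumption on simulated risk scores, not the study's code; the number of resamples and score distributions are made up.

```python
# Illustrative sketch of bootstrap AUROC estimation with a
# paired-sample t-test between two models. Data are simulated.
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 500
y = rng.integers(0, 2, size=n)          # simulated SSI outcomes
score_a = y + rng.normal(scale=0.8, size=n)  # model A: less noisy
score_b = y + rng.normal(scale=1.0, size=n)  # model B: noisier

def paired_bootstrap_aucs(y, s1, s2, n_boot=200):
    """AUROCs of two models on identical bootstrap resamples."""
    a1, a2 = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))
        if len(np.unique(y[idx])) < 2:  # resample needs both classes
            continue
        a1.append(roc_auc_score(y[idx], s1[idx]))
        a2.append(roc_auc_score(y[idx], s2[idx]))
    return np.array(a1), np.array(a2)

auc_a, auc_b = paired_bootstrap_aucs(y, score_a, score_b)
t, p = stats.ttest_rel(auc_a, auc_b)    # paired-sample t-test
lo, hi = np.percentile(auc_a, [2.5, 97.5])
print(f"model A mean AUROC {auc_a.mean():.3f} "
      f"(95% CI {lo:.3f}-{hi:.3f}), paired t-test P = {p:.4f}")
```

Because both models are evaluated on the same resamples, the t-test operates on paired AUROC differences rather than two independent samples.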
Conclusions:
Our AMRAMS, based on EMR data and deep learning methods (CNN and self-attention network), showed significant advantages in accuracy over conventional machine learning methods and the NNIS risk index. The semantic embeddings of preoperative notes further improved model performance. Our models could replace the NNIS risk index to provide personalized guidance for preoperative SSI intervention, and through this case we offer an easy-to-implement solution for building multimodal risk assessment models in other similar scenarios.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.