Accepted for/Published in: JMIR AI
Date Submitted: Apr 10, 2023
Open Peer Review Period: Apr 10, 2023 - Jun 5, 2023
Date Accepted: Jan 13, 2024
Improving risk prediction of Methicillin-resistant Staphylococcus aureus (MRSA) using network features
ABSTRACT
Background:
Healthcare-associated infections (HAI) due to multi-drug resistant organisms (MDROs), such as Methicillin-resistant Staphylococcus aureus (MRSA) and C. difficile, place a significant burden on our healthcare infrastructure.
Objective:
Screening for MDROs is an important mechanism for preventing spread but is resource intensive. Automated tools that predict colonization/infection risk using Electronic Health Record (EHR) data could provide useful information to aid infection control and guide empiric antibiotic coverage.
Methods:
We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated hospitalized patients at the University of Virginia hospital at the time of sample collection. We use clinical and non-clinical features derived from on-admission and throughout-stay information in the patient’s EHR data to build the model. Additionally, we use a class of features derived from contact networks in EHR data; these network features capture patients’ contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explore heterogeneous models for different patient subpopulations, e.g., those admitted to an ICU or ED or with specific testing histories, which yield better performance.
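As an illustration of what contact-network features might look like, the following minimal sketch derives three per-patient counts from a log of patient-provider interactions. The log format, the identifiers, the `known_positive` set, and the three feature names are all assumptions made for this example; the abstract does not specify which network features the authors actually use.

```python
from collections import defaultdict

# Hypothetical EHR interaction log: (patient_id, provider_id) pairs.
contacts = [
    ("p1", "nurse_a"), ("p2", "nurse_a"), ("p3", "nurse_a"),
    ("p1", "dr_b"), ("p4", "dr_b"),
    ("p5", "dr_c"),
]
# Hypothetical set of patients with a prior positive MRSA test.
known_positive = {"p2", "p4"}


def network_features(contacts, known_positive):
    """Return simple contact-network features for every patient in the log."""
    providers_of = defaultdict(set)   # patient -> providers seen
    patients_of = defaultdict(set)    # provider -> patients seen
    for patient, provider in contacts:
        providers_of[patient].add(provider)
        patients_of[provider].add(patient)

    features = {}
    for patient, providers in providers_of.items():
        # Other patients reachable through at least one shared provider.
        neighbors = set()
        for provider in providers:
            neighbors |= patients_of[provider]
        neighbors.discard(patient)
        features[patient] = {
            "n_providers": len(providers),
            "n_patient_contacts": len(neighbors),
            "n_positive_contacts": len(neighbors & known_positive),
        }
    return features


features = network_features(contacts, known_positive)
print(features["p1"])
```

Counts like these can be appended to the clinical feature vector for each patient; a real pipeline would also need to restrict contacts to a time window before the sample was collected.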
Results:
We find that logistic regression performs better than the other methods, and that the performance (ROC-AUC) of this model improves by nearly 11% when we apply a second-degree polynomial transformation to the features. Features that are significant in predicting MDRO risk include antibiotic usage, surgery, device, dialysis, patient comorbidities, and network features. Among these, network features add the most value and improve the model performance by at least 15%. The logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations.
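To show why a second-degree polynomial transformation can help logistic regression, the sketch below expands each feature vector with all pairwise products and fits a plain gradient-descent logistic regression on a toy dataset. The two binary features, their names, the labels, and the training settings are invented for illustration and are not the study's data or the authors' implementation; the toy labels depend only on an interaction between the two features, which a linear decision boundary cannot capture.

```python
import math
from itertools import combinations_with_replacement


def poly2(row):
    """Degree-2 expansion: original features plus all pairwise products."""
    pairs = combinations_with_replacement(range(len(row)), 2)
    return list(row) + [row[i] * row[j] for i, j in pairs]


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # sigmoid(z) - y
            grad_b += err
            for k, xk in enumerate(row):
                grad_w[k] += err * xk
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def accuracy(X, y, w, b):
    """Fraction of rows whose predicted class (logit > 0) matches the label."""
    preds = [int(b + sum(wi * xi for wi, xi in zip(w, row)) > 0) for row in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


# Toy data (hypothetical): columns are [recent_antibiotics, has_device].
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]  # label driven purely by the interaction of the two features

w_lin, b_lin = fit_logistic(X, y)
X_poly = [poly2(row) for row in X]
w_poly, b_poly = fit_logistic(X_poly, y)

acc_linear = accuracy(X, y, w_lin, b_lin)
acc_poly = accuracy(X_poly, y, w_poly, b_poly)
print(acc_linear, acc_poly)
```

On this toy problem the model with raw features cannot separate the classes, while the expanded model can because the product term gives it a weight for the interaction. Gains on real EHR data depend on how much of the signal lies in such interactions.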
Conclusions:
Our work shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and non-clinical features derived from EHR data. Network features are most predictive and give significant improvement over prior methods. Further, heterogeneous prediction models for different patient subpopulations enhance the model's performance.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.