Accepted for/Published in: JMIR Formative Research
Date Submitted: Sep 19, 2022
Date Accepted: Feb 19, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Comparison of machine learning algorithms for predicting hospital readmissions and worsening heart failure events in patients with heart failure with reduced ejection fraction
ABSTRACT
Background:
Among patients with heart failure (HF) with reduced ejection fraction (HFrEF), it is crucial to identify those at high risk of subsequent events following an HF hospitalization.
Objective:
This study compared different machine learning (ML) prediction models and feature construction methods to predict 30-day, 90-day, and 365-day hospital readmissions and worsening heart failure events (WHFEs).
Methods:
The study used the Veradigm PINNACLE outpatient registry® linked to Symphony Health’s Integrated Dataverse (IDV) data from July 1, 2013, to September 30, 2017. Adults with a confirmed diagnosis of HFrEF and an HF-related hospitalization were included. WHFEs were defined as HF-related hospitalizations or outpatient intravenous diuretic use within one year following the first HF hospitalization. We used four approaches to construct ML features from clinical codes: (1) frequencies of Clinical Classifications Software (CCS) categories; (2) Bidirectional Encoder Representations from Transformers (BERT) trained on CCS sequences (BERT + CCS); (3) BERT trained on raw clinical codes (BERT + raw); and (4) prespecified features based on clinical knowledge. Multilayer perceptron neural network (MLP NN), eXtreme Gradient Boosting (XGBoost), random forest, and logistic regression prediction models were applied and compared.
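As an illustration of the first feature construction approach (CCS category frequencies), the following is a minimal sketch and not the authors' implementation. The function name `ccs_frequency_features`, the toy code-to-category mapping, and the placeholder code strings are all hypothetical; a real pipeline would use the published CCS mapping files and the registry's actual diagnosis and procedure codes.

```python
from collections import Counter


def ccs_frequency_features(patient_codes, code_to_ccs, vocabulary=None):
    """Convert each patient's list of raw clinical codes into CCS category counts.

    patient_codes: list of lists of raw code strings, one inner list per patient.
    code_to_ccs:   dict mapping a raw code to a CCS category label (assumed input).
    vocabulary:    optional fixed, ordered list of CCS categories; if omitted,
                   it is built from the categories observed in the data.
    Returns (vocabulary, rows), where rows[i][j] is how often category
    vocabulary[j] appears in patient i's history.
    """
    # Map raw codes to categories, silently skipping codes without a mapping.
    mapped = [
        [code_to_ccs[code] for code in codes if code in code_to_ccs]
        for codes in patient_codes
    ]
    if vocabulary is None:
        vocabulary = sorted({cat for cats in mapped for cat in cats})

    rows = []
    for cats in mapped:
        counts = Counter(cats)
        rows.append([counts.get(cat, 0) for cat in vocabulary])
    return vocabulary, rows


# Toy example with made-up codes and categories (not real CCS assignments).
toy_mapping = {"code_a1": "CAT_A", "code_a2": "CAT_A", "code_b1": "CAT_B"}
toy_patients = [
    ["code_a1", "code_a2", "code_b1"],  # patient 0
    ["code_b1", "code_unmapped"],       # patient 1
]
vocab, features = ccs_frequency_features(toy_patients, toy_mapping)
print(vocab)
print(features)
```

The resulting count matrix can be passed to any of the tabular models named above; grouping raw codes into categories keeps the feature space small compared with one column per raw code.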
Results:
A total of 30,687 adult patients with HFrEF were included. The rates of 30-day readmission and WHFEs were 11.4% and 42.8%, respectively. The prediction models and feature combinations with the best area under the receiver operating characteristic curve (AUC) for each outcome were: XGBoost with CCS frequency (0.595) for 30-day readmission; random forest with CCS frequency (0.630) for 90-day readmission; XGBoost with CCS frequency (0.649) for 365-day readmission; and XGBoost with CCS frequency (0.640) for WHFEs.
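For readers less familiar with the AUC metric reported here, the sketch below computes it from its pairwise definition: the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes (1 = event, e.g. readmission).
    scores: iterable of predicted risks, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC is undefined without both classes present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example (made-up values): 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

This quadratic-time version is meant for clarity only; library implementations use sorting to obtain the same quantity more efficiently.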
Conclusions:
These results demonstrate that ML models provide modest discrimination of HF events among HFrEF patients. Features identified by data-driven approaches may be comparable to those identified by clinical domain knowledge. Future work may be warranted to validate and further improve the models using more comprehensive and complete data.
Clinical Trial: N/A
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.