Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Sepsis Prediction at Emergency Department Triage Using Natural Language Processing
ABSTRACT
Background:
Despite its high lethality, sepsis can be difficult to detect on initial presentation to the emergency department (ED). Machine learning (ML) based tools may provide avenues for earlier detection and life-saving intervention.
Objective:
We aimed to predict sepsis at the time of ED triage using natural language processing (NLP) of nursing triage notes and available clinical data.
Methods:
We constructed a retrospective cohort of all 1,234,434 consecutive ED encounters in 2015-2021 from four separate, clinically heterogeneous, academically affiliated EDs. After exclusion criteria were applied, the final cohort included 1,059,386 adult ED encounters. The primary outcome criteria for sepsis were presumed severe infection and acute organ dysfunction. After vectorization and dimensionality reduction of triage notes and clinical data available at triage, a decision-tree-based ensemble (“time-of-triage”) model was trained to predict sepsis using the training subset (n=950,921). A separate (“comprehensive”) model was trained using these data together with laboratory data as they became available at 1-hour intervals after triage. Model performance was evaluated on the test subset (n=108,465).
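As a rough illustration of the pipeline described above (vectorize the triage note, reduce it to a fixed low dimension, append structured triage data, fit a tree-based ensemble), the following is a minimal sketch using only the Python standard library. The abstract does not name the vectorizer, the reduction method, or the specific ensemble, so everything here is an assumption: the hashing trick stands in for vectorization plus dimensionality reduction, bagged depth-1 decision trees (stumps) stand in for the ensemble, and the notes, vital signs, labels, and all function names are invented for the example.

```python
import random
import zlib

DIM = 16  # hypothetical size of the hashed text representation


def vectorize(note, dim=DIM):
    """Hash each token into a fixed-length count vector (hashing trick)."""
    vec = [0.0] * dim
    for token in note.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


def featurize(note, heart_rate, temperature_c):
    """Concatenate hashed note features with two structured triage values."""
    return vectorize(note) + [heart_rate, temperature_c]


def fit_stump(rows, labels):
    """Fit a depth-1 tree: the (feature, threshold, polarity) with fewest errors."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            for polarity in (1, 0):  # label predicted when value > threshold
                errors = sum(
                    (polarity if row[j] > threshold else 1 - polarity) != y
                    for row, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, polarity)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, polarity = stump
    return polarity if row[j] > threshold else 1 - polarity


def fit_ensemble(rows, labels, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training rows."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return ensemble


def predict_proba(ensemble, row):
    """Fraction of stumps voting for the positive (sepsis) class."""
    return sum(predict_stump(s, row) for s in ensemble) / len(ensemble)


# Invented toy encounters: (triage note, heart rate, temperature in C, sepsis label)
encounters = [
    ("fever chills confusion low blood pressure", 124.0, 39.4, 1),
    ("productive cough fever rigors lethargic", 118.0, 38.9, 1),
    ("dysuria fever altered mental status", 121.0, 39.1, 1),
    ("twisted ankle playing soccer", 78.0, 36.8, 0),
    ("laceration to left hand", 82.0, 36.9, 0),
    ("medication refill request", 74.0, 36.7, 0),
]
rows = [featurize(note, hr, temp) for note, hr, temp, _ in encounters]
labels = [e[3] for e in encounters]

ensemble = fit_ensemble(rows, labels)
new_row = featurize("fever and confusion since yesterday", 120.0, 39.0)
score = predict_proba(ensemble, new_row)
print(f"estimated sepsis score: {score:.2f}")
```

A real system would train on far more data with a stronger learner; the sketch only shows how free text and structured triage values can share one feature vector before a tree ensemble is fit.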
Results:
Sepsis occurred in 35,318 encounters (incidence 3.45%). For sepsis prediction at the time of patient triage, using the primary definition, area under the receiver operating characteristic curve (AUC) and macro F1 score for sepsis were 0.94 and 0.60, respectively. Sensitivity, specificity, and false positive rate were 0.87, 0.85, and 0.15, respectively. The time-of-triage model correctly predicted sepsis in 81% of sepsis cases where sepsis screening was not initiated at triage and 98% of cases where sepsis screening was initiated at triage. Positive and negative predictive values were 0.18 and 0.99, respectively. For sepsis prediction utilizing laboratory data available each hour after ED arrival, AUC was 0.94 at 1 hour and peaked at 0.97 at 12 hours. Similar results were obtained when the model was evaluated using the CDC Hospital Toolkit for Adult Sepsis Surveillance criteria to define sepsis. Among septic cases, sepsis was predicted in 33%, 48%, and 67% of encounters at 3, 2, and 1 hours prior to first intravenous antibiotic order, respectively.
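The low positive predictive value despite high sensitivity and specificity follows from the low prevalence of sepsis. A minimal sketch of that relationship via Bayes' rule is given below. It assumes that the test-subset prevalence equals the reported overall incidence (3.45%) and it uses the rounded sensitivity and specificity from the abstract, so it reproduces the reported predictive values only approximately (about 0.17 and 0.99). The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values from the abstract; prevalence is assumed equal to incidence.
ppv, npv = predictive_values(sensitivity=0.87, specificity=0.85, prevalence=0.0345)
print(f"PPV ~ {ppv:.2f}, NPV ~ {npv:.2f}")
```

With roughly 3 septic encounters per 100, even a 15% false positive rate yields several false alerts for every true one, which is why PPV stays near 0.2 while NPV approaches 1.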
Conclusions:
Sepsis can be accurately predicted at ED presentation using nursing triage notes and clinical information available at the time of triage. This indicates that ML can facilitate timely and reliable alerting for intervention. Free-text data can improve the performance of predictive modeling at the time of triage and throughout the ED course.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.