Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Oct 1, 2022
Date Accepted: Mar 29, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Validation of Natural Language Processing for Identification of Venous Thromboembolism in a Clinical Decision Support System
ABSTRACT
Background:
Surveillance of venous thromboembolism (VTE) is necessary for improving patient safety in hospitals, but current detection may be inefficient given the complexity of clinical settings. Whether applying machine learning-based natural language processing (NLP) to the available data can improve VTE detection across settings with different VTE risk remains unknown.
Objective:
To validate the accuracy of NLP in a clinical decision support system for VTE risk assessment and integrated care (DeVTEcare), linked with electronic health records (EHRs), for identifying pulmonary embolism (PE) and deep vein thrombosis (DVT).
Methods:
All adult inpatients from January 1 to December 31, 2021, were included as the validation cohort. Sensitivity, specificity, and positive and negative likelihood ratios (LR+ and LR-) were used to assess the diagnostic performance of DeVTEcare, with VTE labeled by clinical experts as the gold standard.
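As a minimal sketch of how the metrics named above follow from a 2x2 table of NLP output against expert labels, the Python below computes sensitivity, specificity, LR+, LR-, and F1-score. The class name, function name, and counts are hypothetical and chosen only for illustration; they are not the study's data or the DeVTEcare implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of NLP output versus the expert-labeled gold standard."""
    tp: int  # NLP flags VTE, expert confirms VTE
    fp: int  # NLP flags VTE, expert finds no VTE
    fn: int  # NLP misses a VTE that the expert confirms
    tn: int  # NLP and expert both find no VTE


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return standard diagnostic accuracy metrics.

    Assumes fp > 0 and tn > 0; with perfect specificity LR+ is undefined
    (division by zero) and needs separate handling.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = ConfusionCounts(tp=90, fp=5, fn=10, tn=895)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because LR+ divides by the false-positive rate, it grows very large as specificity approaches 100%, which is why small differences in specificity can produce large differences in LR+.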
Results:
Among 30,152 patients (median age 56 years [interquartile range: 41-67]; 47% female), the prevalence of VTE, PE, and DVT was 2.1%, 0.6%, and 1.8%, respectively. For NLP-facilitated detection of any VTE, the sensitivity, specificity, LR+, LR-, area under the receiver operating characteristic curve (AUC), and F1-score (95% confidence interval, CI) were 89.9% (87.3% - 92.2%), 99.8% (99.8% - 99.9%), 483 (370 - 629), 0.10 (0.08 - 0.13), 0.95 (0.94 - 0.96), and 0.90 (0.90 - 0.91), respectively. Specificity (100% vs. 99.7% vs. 98.8%), LR+ (3202 vs. 321 vs. 77), and F1-score (0.95 vs. 0.89 vs. 0.92) were highest in surgery departments, compared with internal medicine departments and intensive care units (all P < .001). Detection performed best in departments with low VTE risk (low- vs. intermediate- vs. high-risk: AUC, 1.00 vs. 0.94 vs. 0.96; DeLong test P < .001) and in patients aged ≤ 65 years (vs. > 65 years: F1-score, 0.93 vs. 0.89; P < .001).
Conclusions:
The NLP algorithm of DeVTEcare identified VTE well across different clinical settings, especially in surgery units and departments with low VTE risk. It can support accurate estimation of the in-hospital VTE rate and may help enhance risk-classified integrated VTE care in future research.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.