Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 1, 2022
Date Accepted: Mar 29, 2023

The final, peer-reviewed published version of this preprint can be found here:

Natural Language Processing in a Clinical Decision Support System for the Identification of Venous Thromboembolism: Algorithm Development and Validation

Jin Z, Zhang H, Tai MH, Yang Y, Yao Y, Guo Y

J Med Internet Res 2023;25:e43153

DOI: 10.2196/43153

PMID: 37093636

PMCID: 10167583

Natural Language Processing for Identification of Venous Thromboembolism in a Clinical Decision Support System: Validation Study

  • Zhigeng Jin
  • Hui Zhang
  • Mei-Hui Tai
  • Ying Yang
  • Yuan Yao
  • Yutao Guo

ABSTRACT

Background:

Surveillance of venous thromboembolism (VTE) is necessary for improving patient safety in hospitals, but current detection may be inefficient given the complexity of clinical settings. Whether applying machine learning-based natural language processing (NLP) to the available data can improve VTE detection across settings with different levels of VTE risk remains unknown.

Objective:

To validate the accuracy of the NLP algorithm in a clinical decision support system for VTE risk assessment and integrated care (DeVTEcare), linked with electronic health records (EHRs), for identifying pulmonary embolism (PE) and deep vein thrombosis (DVT).

Methods:

All adult inpatients from January 1 to December 31, 2021, were included as the validation cohort. Sensitivity, specificity, and positive and negative likelihood ratios (LR+ and LR-) were used to assess the diagnostic performance of DeVTEcare, with VTE labelled by clinical experts as the gold standard.
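The measures named above all derive from a 2x2 table of algorithm output against the expert label. A minimal sketch of those definitions follows; the function name and the counts in the example are hypothetical and chosen only for illustration. They are not study data, and this is not the authors' code.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the gold standard.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    # LR+ is undefined when there are no false positives; report infinity.
    lr_pos = sensitivity / false_positive_rate if fp > 0 else math.inf
    # LR- is undefined when specificity is zero; report infinity.
    lr_neg = (1 - sensitivity) / specificity if tn > 0 else math.inf
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the study).
example = diagnostic_metrics(tp=90, fp=5, fn=10, tn=895)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

An LR+ far above 10 and an LR- near 0.1 are conventionally read as strong evidence for ruling a condition in and out, respectively, which is why the abstract reports them alongside sensitivity and specificity.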

Results:

Among 30,152 patients (median age 56 years [interquartile range: 41-67]; 47% female), the prevalences of VTE, PE, and DVT were 2.1%, 0.6%, and 1.8%, respectively. For NLP-facilitated detection of any VTE, the sensitivity, specificity, LR+, LR-, area under the receiver operating characteristic curve (AUC), and F1-score (95% confidence interval, CI) were 89.9% (87.3% - 92.2%), 99.8% (99.8% - 99.9%), 483 (370 - 629), 0.10 (0.08 - 0.13), 0.95 (0.94 - 0.96), and 0.90 (0.90 - 0.91), respectively. Specificity (100% vs. 99.7% vs. 98.8%), LR+ (3202 vs. 321 vs. 77), and F1-score (0.95 vs. 0.89 vs. 0.92) were highest in the surgery department compared with internal medicine and intensive care units (all P < .001). VTE detection performed best in departments with low VTE risk (low- vs. intermediate- vs. high-risk: AUC, 1.00 vs. 0.94 vs. 0.96; DeLong test P < .001) and in patients aged ≤ 65 years (vs. > 65 years: F1-score, 0.93 vs. 0.89; P < .001).
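As a rough consistency check on the figures above, the likelihood ratios and F1-score can be approximated from the rounded sensitivity, specificity, and VTE prevalence. This sketch assumes only those three rounded values. It does not use patient-level data, so small deviations from the reported point estimates are expected, especially for LR+, which is very sensitive to rounding of the specificity.

```python
# Rounded values as reported for any-VTE detection.
sensitivity = 0.899
specificity = 0.998
prevalence = 0.021

# Likelihood ratios follow directly from sensitivity and specificity.
lr_pos = sensitivity / (1 - specificity)
lr_neg = (1 - sensitivity) / specificity

# Positive predictive value via Bayes' rule, then F1 as the
# harmonic mean of PPV (precision) and sensitivity (recall).
true_pos_rate = sensitivity * prevalence
false_pos_rate = (1 - specificity) * (1 - prevalence)
ppv = true_pos_rate / (true_pos_rate + false_pos_rate)
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

print(f"LR+ ~ {lr_pos:.0f}, LR- ~ {lr_neg:.2f}, PPV ~ {ppv:.3f}, F1 ~ {f1:.2f}")
```

With these inputs, LR- comes out near 0.10 and F1 near 0.90, matching the reported values. The approximate LR+ falls inside the reported 95% CI (370 - 629) rather than on the point estimate of 483, because it is computed from a specificity rounded to one decimal place.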

Conclusions:

The NLP algorithm of DeVTEcare identified VTE well across different clinical settings, especially in surgery units and departments with low VTE risk. It can inform accurate estimation of the in-hospital VTE rate and support risk-stratified integrated VTE care in future research.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.