JMIR Preprints #36877: Exploring the applicability of using natural language processing to support nationwide venous thromboembolism surveillance

Exploring the applicability of using natural language processing to support nationwide venous thromboembolism surveillance

Aaron Wendelboe;
Ibrahim Saber;
Justin Dvorak;
Alys Adamski;
Natalie Feland;
Nimia Reyes;
Karon Abe;
Thomas Ortel;
Gary Raskob

ABSTRACT

Background:

Conducting public health surveillance for venous thromboembolism (VTE) at a national scale is important for measuring the disease burden and the impact of prevention measures. Integrating natural language processing (NLP) into VTE surveillance may be an efficient and accurate option in establishing a sustainable and cost-effective national surveillance system.

Objective:

We evaluated the performance of the VTE identification instance of IDEAL-X, an NLP tool, in automatically classifying cases of VTE by “reading” unstructured text in diagnostic imaging records.

Methods:

Accessing imaging records from pilot surveillance systems for VTE from Duke University and the University of Oklahoma Health Sciences Center (OUHSC) during 2012–2014, we used a VTE identification model of IDEAL-X to classify cases of VTE that had previously been manually classified according to pre-defined criteria. The performance measures (and 95% confidence intervals [CI]) calculated were accuracy, sensitivity, specificity, and positive and negative predictive values.
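To make the five performance measures concrete, the following is a minimal sketch of how they can be derived from a 2x2 table comparing the NLP classification with the manual classification. The abstract does not say how the confidence intervals were calculated, so the Wilson score interval used here is an assumption, and the function names (`wilson_ci`, `classification_metrics`) and the counts in the usage example are hypothetical, not data from the study.

```python
from math import sqrt


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Assumption: the source does not name its CI method; Wilson is one
    common choice for proportions and is used here only for illustration.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def classification_metrics(tp, fp, fn, tn):
    """Return {measure: (estimate, (ci_low, ci_high))} from a 2x2 table.

    tp/fn: manually confirmed VTE records the tool flagged / missed.
    tn/fp: manually confirmed non-VTE records the tool cleared / flagged.
    """
    # Each measure is a proportion: (numerator, denominator).
    pairs = {
        "accuracy": (tp + tn, tp + fp + fn + tn),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}


# Hypothetical counts for illustration only (not the study's data).
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.3f} (95% CI: {low:.3f} to {high:.3f})")
```

Sensitivity and specificity condition on the manual (reference) classification, whereas the predictive values condition on the tool's output, so the predictive values also depend on how common VTE is among the records reviewed.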

Results:

The VTE model of IDEAL-X “read” 1591 records from Duke University and 1487 records from OUHSC for a total of 3078 records. The combined performance measures were 93.7% accuracy (95% CI: 93.7%–93.8%), 96.3% sensitivity (95% CI: 96.2%–96.4%), 92.0% specificity (95% CI: 91.9%–92.0%), 89.1% positive predictive value (95% CI: 89.0%–89.2%), and 97.3% negative predictive value (95% CI: 97.3%–97.4%). The sensitivity was higher at Duke University (97.9%, 95% CI: 97.8%–98.0%) than at OUHSC (93.8%, 95% CI: 93.5%–94.0%), but the specificity was higher at OUHSC (96.3%, 95% CI: 96.2%–96.3%) than at Duke University (90.4%, 95% CI: 90.1%–90.5%).

Conclusions:

The VTE model of IDEAL-X accurately classified cases of VTE from pilot surveillance systems from 2 separate states and health systems. NLP is a promising tool in the design and implementation of an automated national surveillance system for VTE.

Citation

Please cite as:

Wendelboe A, Saber I, Dvorak J, Adamski A, Feland N, Reyes N, Abe K, Ortel T, Raskob G

Exploring the Applicability of Using Natural Language Processing to Support Nationwide Venous Thromboembolism Surveillance: Model Evaluation Study

JMIR Bioinform Biotech 2022;3(1):e36877

DOI: 10.2196/36877

PMID: 37206160

PMCID: 10193259

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Bioinformatics and Biotechnology

Date Submitted: Jan 31, 2022

Date Accepted: Jul 21, 2022
