JMIR Preprints #44191: Automated identification of aspirin exacerbated respiratory disease using natural language processing and machine learning

Automated identification of aspirin exacerbated respiratory disease using natural language processing and machine learning

Thanai Pongdee;
Nicholas Larson;
Rohit Divekar;
Suzette Bielinski;
Hongfang Liu;
Sungrim Moon

ABSTRACT

Background:

Aspirin-exacerbated respiratory disease (AERD) is an acquired inflammatory condition characterized by asthma, chronic rhinosinusitis with nasal polyposis, and respiratory hypersensitivity reactions on ingestion of aspirin or other nonsteroidal anti-inflammatory drugs (NSAIDs). Although AERD has a classic constellation of symptoms, the diagnosis is often overlooked, with an average of more than 10 years between the onset of symptoms and the diagnosis of AERD. Without a diagnosis, individuals lack opportunities to receive effective treatments such as aspirin desensitization or biologic medications.

Objective:

To develop a natural language processing (NLP) algorithm to identify patients with AERD from an electronic health record (EHR).

Methods:

A rule-based NLP algorithm was developed using clinical documents from the EHR at Mayo Clinic. Seven features were extracted from clinical notes: AERD, asthma, NSAID allergy, nasal polyps, chronic sinusitis, elevated urine leukotriene E4 level, and absence of NSAID allergy as documented by a health care provider. MedTagger was used to extract these seven features from the unstructured clinical text, given a set of keywords and patterns derived from chart review by two allergy/immunology experts for AERD. The status of each extracted feature was represented as either present or absent for each subject. To determine the representative combination of features to discriminate the different AERD features, we used the entropy criterion of the decision tree classifier.
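The entropy criterion mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the feature keys are hypothetical names for the seven features listed above, and the six subjects are invented toy data used only to show how a decision tree scores a present/absent feature by information gain.

```python
import math
from collections import Counter

# Hypothetical keys for the seven present/absent features named in the Methods.
FEATURES = [
    "aerd_mention", "asthma", "nsaid_allergy", "nasal_polyps",
    "chronic_sinusitis", "elevated_urine_lte4", "no_nsaid_allergy",
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one binary feature."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for row, y in zip(rows, labels) if row[feature] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain

# Invented toy data (NOT from the study): one tuple of 0/1 flags per subject.
raw = [
    (1, 1, 1, 1, 1, 1, 0),
    (1, 1, 1, 1, 0, 0, 0),
    (1, 1, 0, 1, 1, 0, 0),
    (0, 1, 0, 1, 1, 0, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 0, 0),
]
rows = [dict(zip(FEATURES, flags)) for flags in raw]
labels = [1, 1, 1, 0, 0, 0]  # 1 = AERD, 0 = not AERD (toy labels)

# A tree grown with the entropy criterion splits first on the feature
# with the largest information gain.
best = max(FEATURES, key=lambda f: information_gain(rows, labels, f))
print(best, round(information_gain(rows, labels, best), 3))
```

In practice a library classifier would repeat this scoring recursively on each branch; the sketch shows only the first split.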

Results:

The AERD-NLP algorithm achieved an accuracy of 88.00% (95% CI 82.10%-91.74%) on the training set and 84.50% (95% CI 78.73%-89.22%) on the test set.

Conclusions:

We developed a promising AERD-NLP algorithm that needs further refinement to improve AERD diagnosis accuracy. Continued development of NLP and other artificial intelligence technologies has the potential to reduce diagnostic delays for AERD and improve the health of our patients.

Citation

Please cite as:

Pongdee T, Larson N, Divekar R, Bielinski S, Liu H, Moon S

Automated Identification of Aspirin-Exacerbated Respiratory Disease Using Natural Language Processing and Machine Learning: Algorithm Development and Evaluation Study

JMIR AI 2023;2:e44191

DOI: 10.2196/44191

PMID: 39105270

PMCID: 11296676

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR AI

Date Submitted: Nov 9, 2022

Date Accepted: May 22, 2023
