Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Automated identification of aspirin exacerbated respiratory disease using natural language processing: a pilot study
ABSTRACT
Background:
Aspirin-exacerbated respiratory disease (AERD) is an acquired inflammatory condition characterized by asthma, chronic rhinosinusitis with nasal polyposis, and respiratory hypersensitivity reactions on ingestion of aspirin or other nonsteroidal anti-inflammatory drugs (NSAIDs). Although AERD presents with a classic constellation of symptoms, the diagnosis is often overlooked, with an average of more than ten years between symptom onset and diagnosis. Without a diagnosis, individuals miss opportunities to receive effective treatments such as aspirin desensitization or biologic medications.
Objective:
To develop a natural language processing (NLP) algorithm to identify patients with AERD from an electronic health record (EHR).
Methods:
A rule-based NLP algorithm was developed using clinical documents from the EHR at Mayo Clinic. Seven features were extracted from clinical notes: AERD, asthma, NSAID allergy, nasal polyps, chronic sinusitis, elevated urine leukotriene E4 level, and the absence of NSAID allergy as documented by a health care provider. MedTagger was used to extract these seven features from unstructured clinical text, using a set of keywords and patterns based on chart review for AERD by two allergy/immunology experts. Each extracted feature was recorded as either present or absent for each subject. To determine which combination of features best discriminates AERD status, we used a decision tree classifier with the entropy splitting criterion.
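As a rough illustration of the two steps described in the Methods, the sketch below flags features as present or absent with keyword patterns and then scores each feature by entropy-based information gain, the quantity a decision tree uses to pick a split. It is not the study's implementation: the regular expressions, the feature subset, the example notes, and the labels are all hypothetical, MedTagger is not used, and negation (for example "no NSAID allergy") is deliberately not handled.

```python
import math
import re
from collections import Counter

# Hypothetical keyword patterns for three of the seven features.
# The study's actual MedTagger rules are not given in the abstract.
PATTERNS = {
    "asthma": re.compile(r"\basthma\b", re.I),
    "nasal_polyps": re.compile(r"\bnasal polyp(s|osis)?\b", re.I),
    "nsaid_allergy": re.compile(
        r"\b(aspirin|nsaid)s? (allergy|hypersensitivity|intolerance)\b", re.I
    ),
}


def extract_features(note):
    """Return a present (1) / absent (0) flag per feature for one note.

    Toy matcher: it ignores negation and context, which a real
    clinical NLP pipeline would need to handle.
    """
    return {name: int(bool(p.search(note))) for name, p in PATTERNS.items()}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction from splitting on one binary feature."""
    remainder = 0.0
    for value in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[feature] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


# Made-up notes with made-up AERD labels (1 = AERD, 0 = not AERD).
notes = [
    ("History of asthma and nasal polyposis; aspirin hypersensitivity noted.", 1),
    ("Asthma with nasal polyps. NSAID allergy documented.", 1),
    ("Asthma, well controlled. Tolerates ibuprofen.", 0),
    ("Seasonal rhinitis only.", 0),
]
rows = [extract_features(text) for text, _ in notes]
labels = [label for _, label in notes]

for name in PATTERNS:
    print(name, round(information_gain(rows, labels, name), 4))
```

On this toy data, asthma alone carries little information because it appears in both classes, while the NSAID allergy and nasal polyp flags separate the two groups cleanly. A full decision tree repeats this scoring recursively on each resulting subset.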
Results:
The AERD-NLP algorithm achieved an accuracy of 88.00% (95% CI=[82.10, 91.74]) on the training set and 84.50% (95% CI=[78.73, 89.22]) on the test set.
Conclusions:
We developed a promising AERD-NLP algorithm that needs further refinement to improve AERD diagnosis accuracy. Continued development of NLP and other artificial intelligence technologies has the potential to reduce diagnostic delays for AERD and improve the health of our patients.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.