Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Mar 19, 2021
Date Accepted: Jul 5, 2021
Employing AI with NLP to Combine EHR’s Structured and Free Text Data to Identify NVAF to Decrease Strokes and Death
ABSTRACT
Background:
Non-valvular atrial fibrillation (NVAF) affects almost 6 million Americans and is a major contributor to strokes, yet it remains substantially underdiagnosed and undertreated despite explicit guidelines for oral anticoagulation.
Objective:
We investigated whether semi-supervised natural language processing (NLP) of free-text information in electronic health records (EHRs), combined with structured EHR data, improves NVAF discovery and treatment, potentially offering a method to prevent thousands of deaths and save billions of dollars.
Methods:
We abstracted a set of 96,681 participants from the EHR of the University at Buffalo's faculty practice. NLP was used to index the notes, and we compared the ability to identify NVAF and to derive CHA2DS2-VASc and HAS-BLED scores using structured data (ICD codes) versus structured data plus unstructured data from clinical notes. Additionally, we analyzed data from 63,296,120 participants in the Optum and Truven databases to determine NVAF frequency, rates of CHA2DS2-VASc ≥ 2 with no contraindications to oral anticoagulants (OACs), rates of stroke and death in the untreated population, and first-year costs after stroke [16,17].
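As an illustration of the eligibility criterion named in the Methods, the following is a minimal sketch of how a CHA2DS2-VASc score and the "score ≥ 2 with no contraindication to OAC" rule could be computed once risk-factor flags are available, whether they come from ICD codes or from NLP of clinical notes. The point weights follow the standard published CHA2DS2-VASc definition; the `Patient` class, its field names, and the `oac_candidate` helper are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record; each flag may come from structured codes or NLP."""
    age: int
    female: bool
    chf: bool = False                  # congestive heart failure
    hypertension: bool = False
    diabetes: bool = False
    stroke_tia_te: bool = False        # prior stroke, TIA, or thromboembolism
    vascular_disease: bool = False
    oac_contraindicated: bool = False  # assumed to be determined elsewhere


def cha2ds2_vasc(p: Patient) -> int:
    """Sum the standard CHA2DS2-VASc points for one patient."""
    score = 0
    score += 1 if p.chf else 0
    score += 1 if p.hypertension else 0
    # Age contributes 2 points at 75 or older, 1 point at 65 to 74.
    if p.age >= 75:
        score += 2
    elif p.age >= 65:
        score += 1
    score += 1 if p.diabetes else 0
    score += 2 if p.stroke_tia_te else 0
    score += 1 if p.vascular_disease else 0
    score += 1 if p.female else 0
    return score


def oac_candidate(p: Patient) -> bool:
    """Apply the abstract's criterion: score >= 2 and no OAC contraindication."""
    return cha2ds2_vasc(p) >= 2 and not p.oac_contraindicated
```

A risk factor documented only in free text would leave its flag unset if structured data alone were used, which lowers the computed score. That is the kind of gap the combined method is meant to close.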
Results:
The structured-plus-unstructured method would have identified 3,976,056 additional true NVAF cases (P<0.001) and improved sensitivity for CHA2DS2-VASc and HAS-BLED scores compared with structured data alone (P=0.00195 and P<0.001, respectively), a 32.1% improvement. Applied across the United States, this method would prevent an estimated 176,537 strokes, save 10,575 lives, and save more than $13.5 billion.
Conclusions:
AI-informed bio-surveillance that combines NLP of free-text information with structured EHR data improves data completeness and could prevent thousands of strokes, saving lives and funds. This method is applicable to many disorders, with profound public health consequences.
Clinical Trial: None
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.