Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 19, 2021
Date Accepted: Jul 5, 2021

The final, peer-reviewed published version of this preprint can be found here:

Using Artificial Intelligence With Natural Language Processing to Combine Electronic Health Record’s Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study

Elkin P, Mullin S, Mardekian J, Crowner C, Sakilay S, Sinha S, Brady G, Wright M, Nolen K, Trainer J, Koppel R, Schlegel D, Kaushik S, Zhao J, Song B, Anand E

J Med Internet Res 2021;23(11):e28946

DOI: 10.2196/28946

PMID: 34751659

PMCID: 8663460

Employing AI with NLP to Combine EHR’s Structured and Free Text Data to Identify NVAF to Decrease Strokes and Death

  • Peter Elkin; 
  • Sarah Mullin; 
  • Jack Mardekian; 
  • Chris Crowner; 
  • Sylvester Sakilay; 
  • Shyamashree Sinha; 
  • Gary Brady; 
  • Marcia Wright; 
  • Kim Nolen; 
  • JoAnn Trainer; 
  • Ross Koppel; 
  • Daniel Schlegel; 
  • Sashank Kaushik; 
  • Jane Zhao; 
  • Buer Song; 
  • Edwin Anand

ABSTRACT

Background:

Nonvalvular atrial fibrillation (NVAF) affects almost 6 million Americans and is a major contributor to strokes, but it is significantly underdiagnosed and undertreated despite explicit guidelines for oral anticoagulation.

Objective:

We investigate whether semi-supervised natural language processing (NLP) of free-text information in electronic health records (EHRs), combined with structured EHR data, improves NVAF discovery and treatment, perhaps offering a method to prevent thousands of deaths and save billions of dollars.

Methods:

We abstracted a set of 96,681 participants from the EHR of the University at Buffalo's faculty practice. NLP was used to index the notes and to compare the ability to identify NVAF, CHA2DS2-VASc scores, and HAS-BLED scores using structured data (ICD codes) vs. structured plus unstructured data from clinical notes. Additionally, we analyzed data from 63,296,120 participants in the Optum and Truven databases to determine the frequency of NVAF, the rates of CHA2DS2-VASc ≥ 2 with no contraindications to oral anticoagulants (OAC), the rates of stroke and death in the untreated population, and the first-year costs after stroke [16,17].
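The comparison described above can be illustrated with a minimal sketch. It assumes the standard published CHA2DS2-VASc point weights and treats the structured-plus-unstructured method as a simple union of risk factors found in ICD codes and in notes. The condition labels, the function names, and the example patient are hypothetical and are not taken from the study's pipeline.

```python
# Standard CHA2DS2-VASc weights for the condition-based items.
# The keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "chf": 1,               # congestive heart failure
    "hypertension": 1,
    "diabetes": 1,
    "stroke_tia": 2,        # prior stroke or transient ischemic attack
    "vascular_disease": 1,
}


def cha2ds2_vasc(conditions, age, female):
    """Return the CHA2DS2-VASc score for one patient."""
    # set() guards against counting a repeated condition twice.
    score = sum(WEIGHTS[c] for c in set(conditions) if c in WEIGHTS)
    if age >= 75:
        score += 2
    elif age >= 65:
        score += 1
    if female:
        score += 1
    return score


def meets_score_threshold(score):
    """Score threshold named in the abstract (CHA2DS2-VASc >= 2).

    Contraindications to OAC are not modeled here.
    """
    return score >= 2


# Hypothetical patient: one condition is ICD-coded, two appear only in notes.
structured = {"hypertension"}
from_notes = {"diabetes", "chf"}
combined = structured | from_notes

score_structured = cha2ds2_vasc(structured, age=60, female=False)
score_combined = cha2ds2_vasc(combined, age=60, female=False)
print(score_structured, meets_score_threshold(score_structured))
print(score_combined, meets_score_threshold(score_combined))
```

In this made-up case the patient falls below the threshold on structured data alone and crosses it once note-derived conditions are included, which is the kind of difference the two methods are compared on.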

Results:

The structured-plus-unstructured method would have identified 3,976,056 additional true NVAF cases (P<0.001) and improved sensitivity for CHA2DS2-VASc and HAS-BLED scores compared with the structured data alone (P=0.00195 and P<0.001, respectively), a 32.1% improvement. For the United States, this method would prevent an estimated 176,537 strokes, save 10,575 lives, and save over $13.5 billion.

Conclusions:

AI-informed biosurveillance that combines NLP of free-text information with structured EHR data improves data completeness and could prevent thousands of strokes, save lives, and save funds. This method is applicable to many disorders, with profound public health consequences.

Clinical Trial: None


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.