Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jun 17, 2022
Date Accepted: Jul 31, 2022

The final, peer-reviewed published version of this preprint can be found here:

Development and Evaluation of a Natural Language Processing Annotation Tool to Facilitate Phenotyping of Cognitive Status in Electronic Health Records: Diagnostic Study

Noori A, Magdamo C, Liu X, Tyagi T, Kondepudi AV, Alabsi H, Rudman E, Wilcox D, Brenner L, Robbins GK, Moura L, Hsu J, Zafar S, Benson N, Serrano-Pozo A, Dickson J, Hyman BT, Blacker D, Westover B, Mukerji SS, Das S

J Med Internet Res 2022;24(8):e40384

DOI: 10.2196/40384

PMID: 36040790

PMCID: 9472045

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Development and Evaluation of a Natural Language Processing Annotation Tool to Facilitate Phenotyping of Cognitive Status in Electronic Health Records

  • Ayush Noori; 
  • Colin Magdamo; 
  • Xiao Liu; 
  • Tanish Tyagi; 
  • Akhil Venkatmani Kondepudi; 
  • Haitham Alabsi; 
  • Emily Rudman; 
  • Doug Wilcox; 
  • Laura Brenner; 
  • Gregory K. Robbins; 
  • Lidia Moura; 
  • John Hsu; 
  • Sahar Zafar; 
  • Nicole Benson; 
  • Alberto Serrano-Pozo; 
  • John Dickson; 
  • Bradley T. Hyman; 
  • Deborah Blacker; 
  • Brandon Westover; 
  • Shibani S. Mukerji; 
  • Sudeshna Das

ABSTRACT

Background:

Electronic health records (EHR) with large sample sizes and rich information offer great potential for dementia research, but current methods of phenotyping cognitive status are not scalable.

Objective:

To evaluate whether natural language processing (NLP)-powered semi-automated annotation can improve the speed and interrater reliability of chart reviews for phenotyping cognitive status.

Methods:

In this diagnostic study, we developed and evaluated a semi-automated NLP-powered annotation tool (NAT) to facilitate phenotyping of cognitive status. Clinical experts adjudicated the cognitive status of 627 patients at Mass General Brigham (MGB) Healthcare using NAT or traditional chart reviews. Patient charts contained EHR data from two datasets: (1) records from January 1, 2017 to December 31, 2018 for 100 Medicare beneficiaries from the MGB Accountable Care Organization (ACO), and (2) records from 2 years before COVID diagnosis to the date of COVID diagnosis for 527 MGB patients. All EHR data from the relevant period were extracted; diagnosis codes, medications, and laboratory test values were processed and summarized; clinical notes were processed through an NLP pipeline; and a web tool was developed to present an integrated view of all data. Cognitive status was rated as cognitively normal, cognitively impaired, or undetermined. The assessment time and interrater agreement of NAT adjudication were compared with those of manual chart reviews for cognitive status phenotyping.

Results:

NAT adjudication provided higher interrater agreement (Cohen κ=0.89 vs. κ=0.80) and a significant speedup (time difference mean [SD]: 1.4 [1.3] minutes, P < 0.001; ratio median [min, max]: 2.2 [0.4, 20]) over manual chart reviews. Agreement between NAT adjudication and manual chart reviews was moderate (Cohen κ=0.67). In the cases where NAT adjudication disagreed with manual chart review, it produced assessments that had broader clinical consensus, owing to its integrated view of highlighted relevant information and its semi-automated NLP features.
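For readers unfamiliar with the interrater agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa is computed for two raters who assign one of the three cognitive status labels to each chart. The ratings shown are hypothetical and chosen only for illustration; they are not study data, and the function is not the authors' analysis code.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over labels of the product of the two raters'
    # marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings for 10 charts (illustrative only, not study data).
rater_1 = ["normal", "normal", "impaired", "impaired", "undetermined",
           "normal", "impaired", "normal", "normal", "impaired"]
rater_2 = ["normal", "normal", "impaired", "normal", "undetermined",
           "normal", "impaired", "normal", "impaired", "impaired"]

kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))  # 0.66
```

In this toy example the raters agree on 8 of 10 charts (observed agreement 0.80), while their label frequencies imply a chance agreement of 0.42, so kappa is (0.80 - 0.42) / (1 - 0.42), or about 0.66.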

Conclusions:

NAT adjudication improves the speed and interrater reliability for phenotyping cognitive status compared to manual chart reviews. This study underscores the potential of an NLP-based clinically adjudicated method to build large-scale dementia research cohorts from EHR.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.