Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 31, 2020
Date Accepted: Mar 2, 2021
Date Submitted to PubMed: Mar 5, 2021

The final, peer-reviewed published version of this preprint can be found here:

Rybinski M, Dai X, Singh S, Karimi S, Nguyen A

Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis

JMIR Med Inform 2021;9(4):e24020

DOI: 10.2196/24020

PMID: 33664015

PMCID: 8092929

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Family History Extraction from Electronic Health Records

  • Maciej Rybinski
  • Xiang Dai
  • Sonit Singh
  • Sarvnaz Karimi
  • Anthony Nguyen

ABSTRACT

Prognosis, diagnosis, and treatment of many genetic disorders and familial diseases improve significantly if the patient's family history is known. Such information is often recorded in the free text of clinical notes. Our aim is to develop automated methods that make this data accessible through natural language processing. In particular, we apply transformer-based information extraction to identify disease mentions in notes. Our experiments show that combining domain-adaptive pretraining with intermediate-task pretraining achieves an F1 score of 81.63% for the extraction of diseases and family members from notes when tested on a public shared task dataset from the National NLP Clinical Challenges (n2c2). In comparison, the median F1 score of the 17 teams that participated in the 2019 n2c2/OHNLP shared task was 76.59%.
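
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a micro-averaged F1 score can be computed over extracted mentions. It is an illustration only and not the shared task's official scorer: the `Mention` tuple layout, the entity type labels, the exact-match criterion, and the toy data are all assumptions made for this example.

```python
from typing import Iterable, Set, Tuple

# Hypothetical mention representation (an assumption for this sketch):
# (document id, entity type, normalized mention text)
Mention = Tuple[str, str, str]


def micro_f1(gold: Iterable[Mention], predicted: Iterable[Mention]) -> float:
    """Micro-averaged F1 over exact-match mentions; returns 0.0 when nothing matches."""
    gold_set: Set[Mention] = set(gold)
    pred_set: Set[Mention] = set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration.
gold = [
    ("doc1", "FamilyMember", "mother"),
    ("doc1", "Observation", "breast cancer"),
    ("doc2", "FamilyMember", "father"),
    ("doc2", "Observation", "diabetes"),
]
predicted = [
    ("doc1", "FamilyMember", "mother"),
    ("doc1", "Observation", "breast cancer"),
    ("doc2", "FamilyMember", "brother"),
]

score = micro_f1(gold, predicted)
print(f"F1 = {score:.4f}")
```

In this toy case two of three predictions are correct (precision 2/3) and two of four gold mentions are found (recall 1/2), so the harmonic mean is 4/7. Real evaluations typically also define how mentions are normalized and matched, which this sketch deliberately leaves out.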


Per the author's request the PDF is not available.