JMIR Preprints #24020: Family History Extraction from Electronic Health Records

Family History Extraction from Electronic Health Records

Maciej Rybinski;
Xiang Dai;
Sonit Singh;
Sarvnaz Karimi;
Anthony Nguyen

ABSTRACT

Background:

The prognosis, diagnosis, and treatment of many genetic disorders and familial diseases improve significantly when the patient's family history is known. Such information is often recorded in the free text of clinical notes.

Objective:

Our aim is to develop automated methods that enable access to family history data through natural language processing.

Methods:

We use transformer-based information extraction to detect disease mentions in clinical notes. We also experiment with rule-based methods for extracting family member mentions from text, as well as with coreference resolution techniques. We provide a thorough error analysis of the factors that affect such information extraction systems.
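To illustrate what a rule-based family member extractor can look like, the following is a minimal sketch only. The abstract does not describe the authors' actual rules, so the term lexicon, the `maternal`/`paternal` modifier handling, and the function name `extract_family_members` are all assumptions made for this example.

```python
import re

# Hypothetical lexicon of family-member terms (an assumption for this sketch;
# the rules used in the paper are not given in the abstract).
FAMILY_TERMS = [
    "mother", "father", "sister", "brother", "daughter", "son",
    "grandmother", "grandfather", "aunt", "uncle", "cousin",
]

# Longer terms come first so the alternation prefers "grandmother" over "mother".
_MEMBER_ALTERNATION = "|".join(sorted(FAMILY_TERMS, key=len, reverse=True))

# Optional side-of-family modifier, then a family term, then an optional plural "s".
PATTERN = re.compile(
    r"\b(?:(?P<side>maternal|paternal)\s+)?(?P<member>" + _MEMBER_ALTERNATION + r")s?\b",
    re.IGNORECASE,
)


def extract_family_members(text):
    """Return family-member mentions with optional side of family and character span."""
    mentions = []
    for match in PATTERN.finditer(text):
        side = match.group("side")
        mentions.append({
            "member": match.group("member").lower(),
            "side": side.lower() if side else None,
            "span": match.span(),
        })
    return mentions


note = ("Her maternal grandmother had breast cancer; "
        "the patient's father and two brothers have diabetes.")
found = extract_family_members(note)
```

A lexicon match like this only finds mentions. Linking each mention to a disease, or resolving pronouns such as "she" back to a relative, would need the further steps the abstract refers to (named entity recognition and coreference resolution).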

Results:

Our experiments show that a combination of domain-adaptive pretraining and intermediate-task pretraining achieves an F1 score of 81.63% for the extraction of diseases and family members from notes when tested on a public shared task dataset from the National NLP Clinical Challenges (n2c2). In comparison, the median F1 score of the 17 teams participating in the 2019 n2c2/OHNLP shared task was 76.59%.
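For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below uses the standard definition over sets of extracted items. The exact matching criteria of the shared task are not stated in this abstract, so exact-match comparison of (family member, disease) pairs is an assumption, and the example data are invented for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 over two collections of hashable items."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration only (not from the shared task dataset).
gold_pairs = {("father", "diabetes"), ("mother", "asthma"), ("sister", "breast cancer")}
predicted_pairs = {("father", "diabetes"), ("mother", "asthma"), ("brother", "asthma")}

precision, recall, f1 = precision_recall_f1(gold_pairs, predicted_pairs)
```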

Conclusions:

Our approach, which leverages state-of-the-art named entity recognition for disease mention detection, coupled with a hybrid method for family member mention detection, achieved effectiveness close to that of the top three systems participating in the 2019 n2c2 family history extraction challenge, with only the top system outperforming it convincingly in terms of precision.

Citation

Please cite as:

Rybinski M, Dai X, Singh S, Karimi S, Nguyen A

Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis

JMIR Med Inform 2021;9(4):e24020

DOI: 10.2196/24020

PMID: 33664015

PMCID: 8092929

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 31, 2020

Date Accepted: Mar 2, 2021

Date Submitted to PubMed: Mar 5, 2021
