Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Feb 25, 2020
Open Peer Review Period: Feb 25, 2020 - Apr 21, 2020
Date Accepted: May 13, 2020
Date Submitted to PubMed: May 27, 2020
Extraction of Information Related to Drug Safety Surveillance From Electronic Health Record Notes: Joint Modeling of Entities and Relations Using Knowledge-aware Neural Attentive Models
ABSTRACT
Background:
An adverse drug event (ADE) is commonly defined as “an injury resulting from medical intervention related to a drug.” Providing information related to ADEs and alerting caregivers at the point of care can reduce the risk of prescription and diagnosis errors and improve health outcomes. ADEs captured in structured electronic health record (EHR) data, as either coded problems or allergies, are often incomplete, leading to underreporting. It is therefore important to develop capabilities to process unstructured EHR data in the form of clinical notes, which contain richer documentation of a patient’s ADEs. Several natural language processing (NLP) systems have previously been proposed to automatically extract information related to ADEs. However, the results from these systems show that significant improvement is still required for automatic extraction of ADEs from clinical notes.
Objective:
The objective of this study is to improve the automatic extraction of ADEs and related information, such as drugs and their reasons for administration, from patient clinical notes.
Methods:
This research was conducted using discharge summaries from the MIMIC-III database, obtained through the National NLP Clinical Challenges (n2c2) and annotated with Drugs, drug attributes (Strength, Form, Frequency, Route, Dosage, Duration), Adverse Drug Events, Reasons, and relations between drugs and other entities. We developed a deep learning-based system that extracts these drug-centric concepts and relations simultaneously using a joint method enhanced with contextualized embeddings, a position-attention mechanism, and knowledge representations. The joint method generated a different sentence representation with respect to each drug, and these representations were then used to extract the related concepts and relations simultaneously. Contextualized representations trained on the MIMIC-III database were used to capture context-sensitive meanings of words. The position-attention mechanism amplified the benefits of the joint method by generating sentence representations that capture long-distance relations. Knowledge representations were obtained from graph embeddings created using the FDA Adverse Event Reporting System (FAERS) database to improve relation extraction, especially when contextual clues are insufficient.
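The idea of a drug-specific sentence representation can be illustrated with a small sketch. The abstract does not give the model's equations, so everything below is an assumption made for illustration: the function names, the additive scoring rule, the toy sentence, and the fixed `position_bias` values (which a real model would learn) are all hypothetical. The sketch only shows how encoding each token's position relative to a given drug mention yields a different attention distribution over the same sentence for each drug.

```python
import math


def relative_positions(num_tokens, drug_start, drug_end):
    """Signed distance of each token to the drug span; 0 inside the span."""
    positions = []
    for i in range(num_tokens):
        if i < drug_start:
            positions.append(i - drug_start)
        elif i > drug_end:
            positions.append(i - drug_end)
        else:
            positions.append(0)
    return positions


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_attention(content_scores, rel_positions, position_bias):
    """Add a bias looked up by clipped absolute distance, then normalize.

    Assumption: an additive score; the paper's actual formulation may differ.
    """
    max_dist = len(position_bias) - 1
    combined = [
        score + position_bias[min(abs(pos), max_dist)]
        for score, pos in zip(content_scores, rel_positions)
    ]
    return softmax(combined)


# Toy sentence (invented for illustration) containing two drug mentions.
tokens = "Started warfarin for atrial fibrillation ; held aspirin due to bleeding".split()
content_scores = [0.0] * len(tokens)           # placeholder content scores
position_bias = [0.0, 1.0, 0.8, 0.6, 0.4, 0.2]  # hypothetical; learned in practice

# One attention distribution per drug over the same sentence.
weights_warfarin = position_attention(
    content_scores, relative_positions(len(tokens), 1, 1), position_bias
)
weights_aspirin = position_attention(
    content_scores, relative_positions(len(tokens), 7, 7), position_bias
)
```

In this sketch the two weight vectors differ even though the tokens are identical, which is the property that would let a tagger assign "atrial fibrillation" as a Reason for one drug and "bleeding" as an ADE for the other.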
Results:
Our system achieved new state-of-the-art results on the n2c2 dataset, with significant improvements in recognizing the crucial Drug→Reason (F1 0.650 vs 0.579) and Drug→ADE (F1 0.490 vs 0.476) relations.
Conclusions:
We present a system for extracting drug-centric concepts and relations that outperformed current state-of-the-art results. We show that contextualized embeddings, a position-attention mechanism, and knowledge graph embeddings effectively improve deep learning-based concept and relation extraction. This study demonstrates the further potential of deep learning-based methods to help extract real-world evidence from unstructured patient data for drug safety surveillance.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.