Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 27, 2021
Date Accepted: May 30, 2021
Relation Classification for Bleeding Events from Electronic Health Records: Exploration of Deep Learning Systems
ABSTRACT
Background:
Accurate detection of bleeding events from electronic health records (EHRs) is crucial for identifying and characterizing a range of common and serious medical problems. To extract such information from EHRs, it is essential to identify the relations between bleeding events and related clinical entities (e.g., bleeding anatomic sites, lab tests). With the advent of natural language processing (NLP) and deep learning (DL)-based techniques, many studies have examined their applicability to various clinical applications. However, no prior work has used deep learning to extract relations between bleeding events and relevant entities.
Objective:
In this study, we aim to evaluate multiple deep learning systems on a novel EHR dataset for bleeding event-related relation classification.
Methods:
We first expert-annotated a new dataset of 1283 de-identified EHR notes for bleeding events and their attributes. On this dataset, we evaluated three state-of-the-art deep learning architectures, namely, a convolutional neural network (CNN), a graph convolutional network with attention (AGGCN), and BERT-based models (BioBERT, Bio+Clinical BERT, and EhrBERT), on the bleeding event relation classification task.
Results:
Our experiments show that the BERT-based models significantly outperformed CNN and AGGCN. Specifically, BioBERT achieved a macro F1 score of 0.842, outperforming both AGGCN (macro F1 score, 0.828) and CNN (macro F1 score, 0.763), by 1.4 percentage points (P<.001) and 7.9 percentage points (P<.001), respectively.
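For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so rare relation types count as much as frequent ones. The following is a minimal sketch in Python, not the authors' evaluation code; the function name `macro_f1` and the relation labels in the usage note are hypothetical, and it assumes every class that appears in the gold or predicted labels is included in the average.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Return the unweighted mean of per-class F1 scores.

    Assumption: every label seen in gold or pred is averaged,
    including any "no relation" class.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # predicted p, but it was wrong
            fn[g] += 1          # missed an instance of class g
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

As a usage example, with hypothetical gold labels `["site", "site", "lab", "none"]` and predictions `["site", "lab", "lab", "none"]`, the per-class F1 scores are 2/3, 2/3, and 1, so `macro_f1` returns 7/9.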
Conclusions:
In this comprehensive study, we explored and compared different DL systems to classify relations between bleeding events and other medical concepts. On our corpus, BERT-based models outperformed other deep learning models for identifying the relations of bleeding-related entities. BERT-based models benefited from their pre-trained contextualized word representations and from the use of target entity representations rather than a traditional sequence representation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.