
Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jul 16, 2022
Date Accepted: Sep 7, 2022

The final, peer-reviewed published version of this preprint can be found here:

Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study

Li Y, Wang X, Zou L, Xu L, Hui L

JMIR Med Inform 2022;10(10):e41136

DOI: 10.2196/41136

PMID: 36264604

PMCID: 9634522

Relation Extraction in Biomedical Texts: Development of a Multi-Head Attention Model with Syntactic Dependency Feature

  • Yongbin Li; 
  • Xiaohua Wang; 
  • Liping Zou; 
  • Luo Xu; 
  • Linhu Hui

ABSTRACT

Background:

With the rapid expansion of the biomedical literature, biomedical information extraction (IE) has attracted increasing attention from researchers; in particular, relation extraction (RE) between two entities is a long-standing research topic.

Objective:

This paper focuses on two multi-class relation extraction tasks from the BioNLP 2019 Open Shared Tasks: relation extraction in the Bacteria-Biotope task (BB-rel) and binary relation extraction in the plant seed development task (SeeDev-binary). In essence, both tasks aim to extract the relations between annotated entity pairs from biomedical texts, which is a challenging problem.

Methods:

Traditional approaches to these tasks adopted feature-based or kernel-based methods and achieved good performance. For these tasks, we propose a deep learning model based on a combination of several distributed features, including domain-specific word embeddings and part-of-speech (POS), entity-type, distance, and position embeddings. A multi-head attention mechanism is used to extract the global semantic features of the whole sentence. Meanwhile, we introduce the dependency-type feature and the shortest dependency path connecting the two candidate entities in the syntactic dependency graph to enrich the feature representation.
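The core idea above (concatenate several per-token embeddings into one distributed representation, then apply multi-head self-attention over the sentence) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: all dimensions, feature names, and random projections here are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, num_heads, rng):
    """Scaled dot-product self-attention with num_heads heads.
    X: (seq_len, d_model); d_model must be divisible by num_heads."""
    seq_len, d_model = X.shape
    d_k = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Random, untrained projections stand in for learned weights.
        Wq = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Wk = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Wv = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        A = softmax(Q @ K.T / np.sqrt(d_k))   # (seq_len, seq_len) attention
        heads.append(A @ V)                   # (seq_len, d_k) per head
    # Concatenating the heads restores the model dimension.
    return np.concatenate(heads, axis=-1)     # (seq_len, d_model)

# Hypothetical per-token distributed features (dimensions are arbitrary):
# word, POS, entity-type, and distance embeddings, concatenated per token.
seq_len = 6
word  = rng.standard_normal((seq_len, 16))
pos   = rng.standard_normal((seq_len, 4))
etype = rng.standard_normal((seq_len, 4))
dist  = rng.standard_normal((seq_len, 8))
X = np.concatenate([word, pos, etype, dist], axis=-1)  # (6, 32)

out = multi_head_attention(X, num_heads=4, rng=rng)
print(out.shape)  # (6, 32): one contextualized vector per token
```

In a trained model the projection matrices would be learned parameters and the output would feed a classification layer over the relation types; the sketch only shows how the concatenated feature embeddings flow through the attention step.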

Results:

Experiments show that our proposed model performs well on biomedical relation extraction, achieving F1-scores of 65.56% and 38.04% on the test sets of the two tasks, respectively. In particular, on the SeeDev-binary task, our model's F1-score exceeds those of existing models, achieving state-of-the-art performance.

Conclusions:

We demonstrated that the multi-head attention mechanism can learn relevant syntactic and semantic features in different representation subspaces and at different positions, yielding a comprehensive feature representation. Moreover, syntactic dependency features can improve model performance by capturing the dependency relations between entities in biomedical texts.
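The shortest-dependency-path feature referred to above is conventionally computed by breadth-first search over the sentence's dependency parse, treating dependency arcs as undirected edges. The following is an illustrative sketch under that assumption; the toy parse and function name are invented for demonstration, not taken from the paper.

```python
from collections import deque

def shortest_dependency_path(edges, src, dst):
    """BFS over an undirected view of the dependency graph; returns the
    token sequence on the shortest path between two entity head tokens,
    or None if the tokens are not connected."""
    graph = {}
    for head, dep, _label in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue, parent = deque([src]), {src: None}
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:   # walk parents back to src
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:     # visit each token once
                parent[nxt] = node
                queue.append(nxt)
    return None

# Toy dependency parse of "Borrelia burgdorferi infects ticks":
# (head, dependent, relation) triples.
edges = [("infects", "burgdorferi", "nsubj"),
         ("burgdorferi", "Borrelia", "compound"),
         ("infects", "ticks", "obj")]
print(shortest_dependency_path(edges, "burgdorferi", "ticks"))
# ['burgdorferi', 'infects', 'ticks']
```

The tokens (and dependency labels) along this path are exactly the kind of compact, relation-focused evidence a model can encode alongside the full-sentence attention features.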




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.