Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jan 23, 2020
Date Accepted: Jan 20, 2021

The final, peer-reviewed published version of this preprint can be found here:

Jouffroy J, Feldman SF, Lerner I, Rance B, Burgun A, Neuraz A

Hybrid Deep Learning for Medication-Related Information Extraction From Clinical Texts in French: MedExt Algorithm Development Study

JMIR Med Inform 2021;9(3):e17934

DOI: 10.2196/17934

PMID: 33724196

PMCID: 8077811

MedExt: combining expert knowledge and deep learning for medication extraction from French clinical texts

  • Jordan Jouffroy
  • Sarah F Feldman
  • Ivan Lerner
  • Bastien Rance
  • Anita Burgun
  • Antoine Neuraz

ABSTRACT

Background:

Information related to patient medication is crucial for health care. However, up to 80% of this information resides solely in unstructured text, and manual extraction is difficult and time-consuming. Many studies have demonstrated the value of natural language processing for this task, but only a few have addressed French corpora.

Objective:

We aimed to develop a system to extract medication-related information from French clinical text.

Methods:

We developed a hybrid system combining an expert rule-based system (RBS), contextual word embeddings (ELMo) trained on clinical notes, and a deep recurrent neural network (BiLSTM-CRF). The task consisted of extracting drug mentions and their related information (e.g., dosage, frequency, duration, route, and condition). We manually annotated 320 clinical notes extracted from a French clinical data warehouse to train and evaluate the model. We compared the performance of our approach with that of standard approaches: rule-based only or machine learning only, and classic word embeddings. We evaluated the models using token-level recall, precision, and F-measure.
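As an illustration of the token-level evaluation described above, the following minimal Python sketch computes precision, recall, and F-measure per label and overall. It is not the authors' evaluation code. The function name `token_level_scores`, the label names (`DRUG`, `DOSAGE`, `FREQUENCY`, `DURATION`), the outside tag `O`, and the use of micro-averaging for the overall score are all assumptions made for this example; the abstract does not specify the tagging scheme or the averaging method.

```python
from collections import Counter


def token_level_scores(gold, pred, outside="O"):
    """Token-level precision, recall and F-measure, per label and overall.

    `gold` and `pred` are equal-length lists holding one label per token.
    Tokens labelled `outside` belong to no entity and are not scored as a class.
    The "overall" entry is micro-averaged (an assumption for this sketch).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != outside:
                tp[g] += 1          # correct entity token
        else:
            if p != outside:
                fp[p] += 1          # predicted label is wrong for this token
            if g != outside:
                fn[g] += 1          # gold label was missed for this token

    def prf(t, f_pos, f_neg):
        precision = t / (t + f_pos) if t + f_pos else 0.0
        recall = t / (t + f_neg) if t + f_neg else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        return precision, recall, f_measure

    labels = sorted(set(tp) | set(fp) | set(fn))
    scores = {label: prf(tp[label], fp[label], fn[label]) for label in labels}
    scores["overall"] = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return scores


# Hypothetical example: "paracétamol 1 g 3 fois par jour pendant 5 jours"
gold = ["DRUG", "DOSAGE", "DOSAGE", "FREQUENCY", "FREQUENCY",
        "FREQUENCY", "FREQUENCY", "DURATION", "DURATION", "DURATION"]
pred = ["DRUG", "DOSAGE", "DOSAGE", "DOSAGE", "FREQUENCY",
        "FREQUENCY", "FREQUENCY", "O", "DURATION", "DURATION"]

scores = token_level_scores(gold, pred)
for label, (precision, recall, f_measure) in scores.items():
    print(f"{label:10s} P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

In this invented example, 8 of the 10 gold entity tokens are labelled correctly, with 1 false positive and 2 false negatives, so the micro-averaged F-measure is 16/19, or about 0.842. Scoring at the token level gives partial credit when a multi-token mention is only partly recovered, which a strict entity-level match would not.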

Results:

Models combining the RBS, ELMo, and BiLSTM achieved the best results, with an overall F-measure of 89.9%. F-measures per category were 95.3% for medication name, 64.4% for drug class mentions, 95.3% for dosage, 92.2% for frequency, 78.8% for duration, and 62.2% for condition of intake.

Conclusions:

Combining expert rules, deep contextualized embeddings (ELMo), and deep neural networks improves medication information extraction. Our results reveal a synergy between expert knowledge and latent knowledge.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.