JMIR Preprints #31063: Development of a pipeline for Adverse Drug Reactions Identification in clinical Notes (ADRIN): Word embedding models and string matching

Development of a pipeline for Adverse Drug Reactions Identification in clinical Notes (ADRIN): Word embedding models and string matching

Klaske R. Siegersma;
Maxime Evers;
Sophie H. Bots;
Floor Groepenhoff;
Yolande Appelman;
Leonard Hofstra;
Igor I. Tulevski;
G. Aernout Somsen;
Hester M. Den Ruijter;
Marco Spruit;
N. Charlotte Onland-Moret

ABSTRACT

Background:

Knowledge about adverse drug reactions (ADRs) in the population is limited because of underreporting, which hampers surveillance and assessment of drug safety. Accurate information about the incidence of ADRs is therefore of great relevance, and clinical notes are a potential source of it. However, manual labelling of these notes is time-consuming, and automation can improve the use of free-text clinical notes for the identification of ADRs. Furthermore, tools for language processing in languages other than English are not widely available.

Objective:

To design and evaluate a method for the automatic extraction of medication and Adverse Drug Reactions Identification in clinical Notes (ADRIN).

Methods:

Dutch free-text clinical notes (n=277,398) and medication registrations (n=499,435) from the Cardiology Centers of the Netherlands database were used. All clinical notes were used to develop word embedding models. Vector representations from the word embedding models and string matching with a medical dictionary (Medical Dictionary for Regulatory Activities, MedDRA) were used to identify ADRs and medication in a manually labelled test set of clinical notes. Several settings, including search area and punctuation, could be adjusted in the prototype to determine its optimal configuration.
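The string-matching step with an adjustable search area and punctuation setting can be illustrated with a minimal sketch. It is one possible reading of those settings, not the authors' implementation: the abstract does not specify how the search area is measured or how punctuation is used, so the token window, the sentence-boundary rule, the function names, and the toy Dutch term lists (which stand in for MedDRA and the embedding-derived vocabulary) are all assumptions.

```python
import re

# Hypothetical toy lexicons; the study itself used MedDRA and word embedding models.
ADR_TERMS = {"hoesten", "duizeligheid", "spierpijn"}
DRUG_TERMS = {"metoprolol", "simvastatine", "perindopril"}

# Tokens assumed to end a sentence when punctuation is kept.
SENTENCE_END = {".", "!", "?"}


def tokenize(text, keep_punctuation=True):
    """Lowercase the text and split it into word tokens.

    With keep_punctuation=True, punctuation marks become separate tokens.
    """
    pattern = r"\w+|[.,;:!?]" if keep_punctuation else r"\w+"
    return re.findall(pattern, text.lower())


def find_adr_candidates(text, window=5, keep_punctuation=True):
    """Return (drug, adr) pairs where an ADR term lies within `window`
    tokens of a drug term.

    When punctuation is kept, the search in each direction stops at a
    sentence-ending token (an assumed rule for this sketch).
    """
    tokens = tokenize(text, keep_punctuation)
    pairs = []
    for i, tok in enumerate(tokens):
        if tok not in DRUG_TERMS:
            continue
        for step in (-1, 1):  # look left, then right, of the drug mention
            j = i + step
            while 0 <= j < len(tokens) and abs(j - i) <= window:
                if tokens[j] in SENTENCE_END:
                    break
                if tokens[j] in ADR_TERMS:
                    pairs.append((tok, tokens[j]))
                j += step
    return pairs


note = "Patient stopt met perindopril wegens hoesten. Geen duizeligheid."
print(find_adr_candidates(note, window=5, keep_punctuation=True))
print(find_adr_candidates(note, window=5, keep_punctuation=False))
```

In this toy note, keeping punctuation confines the match to the sentence that mentions the drug, whereas dropping it also pairs the drug with the negated term in the following sentence. That is one plausible mechanism by which a smaller search area and retained punctuation could reduce false positives.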

Results:

The ADRIN method was evaluated on a test set of 988 clinical notes written on the stop date of a drug. Multiple versions of the prototype were evaluated for various tasks. Binary classification of ADR presence achieved the highest accuracy, 0.84. A reduced search area and the inclusion of punctuation improved the performance of the pipeline.
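For reference, accuracy in a binary task such as ADR presence is the share of notes whose predicted label matches the manual label. The sketch below shows that calculation on made-up labels; the function name and the example data are hypothetical and do not reproduce the study's test set or its result.

```python
def binary_accuracy(gold, predicted):
    """Fraction of notes where the predicted ADR-presence label equals the manual label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labelled note is required")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Made-up labels for five notes: True means an ADR is present.
gold_labels = [True, False, True, True, False]
predicted_labels = [True, False, False, True, False]
print(binary_accuracy(gold_labels, predicted_labels))
```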

Conclusions:

The ADRIN method and prototype are effective in recognizing ADRs in Dutch clinical notes from cardiac diagnostic screening centers. Surprisingly, incorporation of MedDRA did not result in improved identification on top of word embedding models. The implementation of the ADRIN tool may help to increase the identification of ADRs, resulting in better care and saving substantial health care costs.

Clinical Trial: N/A

Citation

Please cite as:

Siegersma KR, Evers M, Bots SH, Groepenhoff F, Appelman Y, Hofstra L, Tulevski II, Somsen GA, Den Ruijter HM, Spruit M, Onland-Moret NC

Development of a Pipeline for Adverse Drug Reaction Identification in Clinical Notes: Word Embedding Models and String Matching

JMIR Med Inform 2022;10(1):e31063

DOI: 10.2196/31063

PMID: 35076407

PMCID: 8826143

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jun 10, 2021

Date Accepted: Nov 14, 2021
