Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 8, 2022
Date Accepted: Aug 12, 2022

The final, peer-reviewed published version of this preprint can be found here:

Mining Severe Drug Hypersensitivity Reaction Cases in Pediatric Electronic Health Records: Methodology Development and Applications

Yu Y, Zhao Q, Cao W, Wang X, Li Y, Xie Y, Wang X

Mining Severe Drug Hypersensitivity Reaction Cases in Pediatric Electronic Health Records: Methodology Development and Applications

JMIR Med Inform 2022;10(9):e37812

DOI: 10.2196/37812

PMID: 36099001

PMCID: 9516376

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Mining Severe Drug Hypersensitivity Reaction Cases in Pediatrics Electronic Health Records with Limited Resources: Methodology Development and Applications

  • Yuncui Yu
  • Qiuye Zhao
  • Wang Cao
  • Xiaochuan Wang
  • Yanming Li
  • Yuefeng Xie
  • Xiaoling Wang

ABSTRACT

Background:

Severe drug hypersensitivity reactions (DHRs) are drug-induced allergic reactions whose main symptoms are severe skin rash and internal damage. Currently, hospitals report severe DHRs solely through spontaneous reporting systems (SRSs), which depend on the clinicians in charge to file reports. An automatic system that screens clinical notes and flags potential severe DHR cases would reduce both the number of missed positive cases and the manual workload.

Objective:

To design a method that automatically identifies positive DHR cases from clinical notes in the electronic health record (EHR) system while keeping labor and computing costs low; to verify the proposed pipeline on the well-studied N2C2 2016 smoking task, which involves identifying the smoking status of discharged patients; and to apply the verified pipeline to our own task, the automatic identification of severe DHRs in pediatric EHRs.

Methods:

Because both labor and computing resources were limited, the proposed method did not rely on extensive preprocessing, feature engineering, or hyperparameter tuning. The pipeline consisted of three stages: (1) filter long clinical notes using a list of keywords; (2) transform the filtered texts into a high-dimensional feature space using statistical algorithms or contextualized neural language models, such as pretrained BERT models; and (3) train a machine learning classifier with stochastic gradient descent (SGD) in that feature space and assign each transformed clinical note to a predefined category. The method was first verified on the openly available N2C2 2016 smoking task. It was then applied to the automatic identification of severe DHRs, both on an annotated dataset and on nine years of pediatric EHRs.
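The three stages can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the keyword list, the toy English notes, and all function names are invented for illustration; TF-IDF stands in for the feature transformation (a pretrained BERT encoder could replace it); and logistic loss is an assumed choice for the SGD-trained classifier.

```python
import math
import random
import re
from collections import Counter

# Hypothetical keyword list; the abstract does not publish the actual keywords.
KEYWORDS = ("rash", "allergy", "allergic", "hypersensitivity")


def filter_note(note, keywords=KEYWORDS):
    """Stage 1: keep only the sentences that mention at least one keyword."""
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())
    kept = [s for s in sentences if any(k in s.lower() for k in keywords)]
    return " ".join(kept)


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Stage 2a: smoothed inverse document frequency for every token."""
    n = len(docs)
    df = Counter(tok for d in docs for tok in set(tokenize(d)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """Stage 2b: map a document to an L2-normalised sparse TF-IDF vector."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def train_sgd(vectors, labels, epochs=50, lr=0.5, seed=0):
    """Stage 3: logistic regression fitted one example at a time (SGD)."""
    rng = random.Random(seed)
    weights, bias = Counter(), 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = bias + sum(weights[t] * v for t, v in vectors[i].items())
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - labels[i]  # derivative of the log loss w.r.t. z
            for t, v in vectors[i].items():
                weights[t] -= lr * grad * v
            bias -= lr * grad
    return weights, bias


def predict(vec, weights, bias):
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
    return 1 if z > 0 else 0


# Toy, invented notes (1 = severe DHR suspected, 0 = not); not real patient data.
notes = [
    "Admitted with fever. Developed a severe rash after amoxicillin. Discharged home.",
    "Routine visit. No rash and no drug allergy reported. Growth is normal.",
    "Severe allergic reaction with rash after cefaclor. Steroids started.",
    "Seen for cough. Mother denies any drug allergy. Follow up in a week.",
]
labels = [1, 0, 1, 0]

filtered = [filter_note(n) for n in notes]
idf = fit_idf(filtered)
vectors = [tfidf(d, idf) for d in filtered]
weights, bias = train_sgd(vectors, labels)
predictions = [predict(v, weights, bias) for v in vectors]
```

Filtering first keeps the texts short, which matters when a downstream encoder has a fixed input length, and the linear classifier trains quickly without hyperparameter search. The toy data only demonstrates the data flow; it says nothing about real-world accuracy.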

Results:

In the smoking task, the domain-specific pretrained language models ClinicalBERT (94.06%) and DischargeBERT (93.07%) outperformed the open-domain model Bert-base-uncased (91.09%) on filtered texts. The pipeline came close to the state-of-the-art performance on this challenge (94.1% vs 94.2%), which supports its effectiveness. The method was then applied to the DHR task with little transfer work. The domain-specific pretrained language model Medbert-kd-chinese (89.09%) outperformed Bert-base-chinese (88.18%) and the TF-IDF baseline (83.64%). Finally, the model was applied to nine years of EHRs from Beijing Children’s Hospital, where it raised alerts for 1155 cases. After review by clinical experts, 357 cases of severe DHRs were confirmed.
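As a quick reading aid for the hospital-wide screening figures, the share of alerts that clinicians confirmed can be computed directly from the two counts reported above. The variable names are illustrative. The abstract does not report how many severe DHR cases went unflagged, so this share says nothing about recall.

```python
alerts = 1155     # cases flagged by the model (count reported in the Results)
confirmed = 357   # alerts confirmed as severe DHRs after clinician review

confirmed_share = confirmed / alerts
print(f"{confirmed_share:.1%} of alerts were confirmed")
```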

Conclusions:

Comparing various machine learning and deep learning models is worthwhile for a specific phenotyping task. The proposed method merits further exploration, particularly because of its fast development process and low computing cost.

