Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: May 13, 2021
Date Accepted: Apr 26, 2022

The final, peer-reviewed published version of this preprint can be found here:

Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) in the United States: Development and Validation of a Natural Language Processing Method

Zheng C, Duffy J, Liu ILA, Sy LS, Navarro RA, Kim SS, Ryan D, Chen W, Qian L, Mercado C, Jacobsen SJ

JMIR Public Health Surveill 2022;8(5):e30426

DOI: 10.2196/30426

PMID: 35608886

PMCID: 9175103

Warning: This is an author submission that is not peer-reviewed or edited. Preprints, unless they show as "accepted," should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Identifying Cases of Shoulder Injury Related to Vaccine Administration (SIRVA) Using Natural Language Processing

  • Chengyi Zheng
  • Jonathan Duffy
  • In-Lu Amy Liu
  • Lina S. Sy
  • Ronald A. Navarro
  • Sunhea S. Kim
  • Denison Ryan
  • Wansu Chen
  • Lei Qian
  • Cheryl Mercado
  • Steven J. Jacobsen

ABSTRACT

Background:

Shoulder injury related to vaccine administration (SIRVA) accounts for more than half of all claims received by the National Vaccine Injury Compensation Program. However, there is a lack of population-based studies due to the challenge of identifying SIRVA cases in large health care databases.

Objective:

To develop a natural language processing (NLP) method to identify SIRVA cases from clinical notes.

Methods:

We conducted the study among members of a large integrated health care organization who were vaccinated between April 1, 2016, and December 31, 2017, and subsequently received diagnosis codes indicative of shoulder injury. Using a training dataset of 164 individuals with a chart review reference standard, we developed an NLP algorithm to extract shoulder disorder information, including prior vaccination, anatomic location, temporality, and causality. The algorithm assigned positive SIRVA cases to one of three groups (definite, probable, or possible) based on the strength of evidence. We compared the NLP results against a chart review reference standard in a validation sample of 100 vaccinated individuals. We then applied the final automated NLP algorithm to a broader cohort of vaccinated individuals with a shoulder injury diagnosis code and performed manual chart confirmation on a random sample of NLP-identified definite cases and on all NLP-identified probable and possible cases.
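The abstract does not give the algorithm's rules, so the following is only a minimal sketch of how evidence on the four extracted elements (prior vaccination, anatomic location, temporality, causality) could be mapped to the three positive groups. The keyword patterns, the `Evidence` class, the function names, and the tiering rule are all hypothetical illustrations and not the authors' method.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword patterns; the study's actual lexicon is not described
# in the abstract.
VACCINATION = re.compile(r"\b(vaccin\w*|immuni[sz]ation|flu shot)\b", re.I)
LOCATION = re.compile(r"\b(shoulder|deltoid)\b", re.I)
TEMPORALITY = re.compile(r"\b(since|after|following)\b", re.I)
CAUSALITY = re.compile(r"\b(due to|caused by|secondary to|related to)\b", re.I)


@dataclass
class Evidence:
    """Which of the four evidence elements were found in a note."""
    vaccination: bool
    location: bool
    temporality: bool
    causality: bool


def extract_evidence(note: str) -> Evidence:
    """Flag each evidence element by simple keyword search."""
    return Evidence(
        vaccination=bool(VACCINATION.search(note)),
        location=bool(LOCATION.search(note)),
        temporality=bool(TEMPORALITY.search(note)),
        causality=bool(CAUSALITY.search(note)),
    )


def classify(ev: Evidence) -> str:
    """Assumed tiering rule: more supporting elements means a stronger tier."""
    # Without both a vaccination mention and a shoulder location,
    # the note is not treated as a positive case.
    if not (ev.vaccination and ev.location):
        return "negative"
    if ev.temporality and ev.causality:
        return "definite"
    if ev.temporality or ev.causality:
        return "probable"
    return "possible"


note = "Left shoulder pain since flu shot, likely due to injection technique."
print(classify(extract_evidence(note)))
```

A production system would also need negation handling, laterality matching against the injection site, and an explicit onset window, none of which this sketch attempts.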

Results:

In the validation sample, the NLP algorithm correctly classified all 100 individuals (4 SIRVA cases and 96 individuals without SIRVA), for 100% accuracy. In the broader cohort of 53,585 individuals, the NLP algorithm identified 291 definite, 124 probable, and 52 possible SIRVA cases. The chart-confirmation rates for these groups were 95.3%, 67.7%, and 18.9%, respectively.
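To make the arithmetic behind these figures explicit, the sketch below computes sensitivity, specificity, and accuracy from the validation counts, then multiplies each group's size by its chart-confirmation rate. The function and variable names are illustrative. The projected counts are a back-of-the-envelope estimate and not a result reported by the authors, partly because the definite-group rate came from a random sample and not from a full review.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Validation sample: all 4 cases and all 96 non-cases were classified correctly.
validation = confusion_metrics(tp=4, fp=0, tn=96, fn=0)

# Broader cohort: NLP-identified counts and chart-confirmation rates per group.
identified = {"definite": 291, "probable": 124, "possible": 52}
confirmation_rate = {"definite": 0.953, "probable": 0.677, "possible": 0.189}

# Rough projection only: group size times the share confirmed on chart review.
projected = {tier: identified[tier] * confirmation_rate[tier] for tier in identified}
projected_total = sum(projected.values())

print(validation)
print({tier: round(n) for tier, n in projected.items()}, round(projected_total))
```

The chart-confirmation rate plays the role of a positive predictive value within each group, which is why it can be multiplied by the group size to approximate the number of true cases.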

Conclusions:

The algorithm performed with high sensitivity and reasonable specificity in identifying positive SIRVA cases. The NLP algorithm can potentially be used in future population-based studies to identify this rare adverse event, avoiding labor-intensive chart review validation.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.