Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 31, 2024
Open Peer Review Period: Dec 31, 2024 - Feb 25, 2025
Date Accepted: Jul 15, 2025

The final, peer-reviewed published version of this preprint can be found here:

Analyzing Reddit Social Media Content in the United States Related to H5N1: Sentiment and Topic Modeling Study

Pang O, Movahedi Nia Z, Gillies M, Leung D, Bragazzi N, Gizo I, Kong J

Analyzing Reddit Social Media Content in the United States Related to H5N1: Sentiment and Topic Modeling Study

J Med Internet Res 2025;27:e70746

DOI: 10.2196/70746

PMID: 40925599

PMCID: 12457856

Analyzing Reddit Social Media Content in the United States Related to H5N1: A Sentiment and Topic Modeling Study

  • Oscar Pang
  • Zahra Movahedi Nia
  • Murray Gillies
  • Doris Leung
  • Nicola Bragazzi
  • Itlala Gizo
  • Jude Kong

ABSTRACT

Background:

The H5N1 avian influenza A virus represents a serious threat to both animal and human health, with the potential to escalate into a global pandemic. Effective monitoring of social media during H5N1 avian influenza outbreaks could offer critical insights to guide public health strategies. Social media platforms like Reddit, with their diverse and region-specific communities, provide a rich source of data that can reveal collective attitudes, concerns, and behavioral trends in real time.

Objective:

This study aims to analyze Reddit posts from state-specific Subreddits in the United States, posted during the most recent outbreak period (2022 to 2024), to (1) assess the sentiments expressed as the H5N1 outbreak progresses, (2) identify predominant topics discussed, particularly those corresponding to negative sentiments, and (3) explore correlations between these sentiments or topics and the severity and spread of the outbreak in the respective regions.

Methods:

Over 2,000 Reddit posts from 160 Subreddits across 11 highly impacted states from February 2022 to July 2024 were collected. Outbreak data comprising almost 600 entries were obtained from the USDA database. Sentiment classification was performed using a fine-tuned BERT Base model, and posts were categorized into six emotions: anger, fear, joy, love, sadness, and surprise, with a seventh “neutral” category added for low-confidence classifications. Topic modeling was conducted using BERTopic and LDA models. Statistical analyses included calculating correlations between sentiment intensity and outbreak severity levels, and applying the Mann-Whitney U test to assess differences between sentiment categories.
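The low-confidence "neutral" fallback described above can be illustrated with a minimal sketch. It assumes the classifier returns one score per emotion and that a post is relabeled "neutral" when its top score falls below a cutoff. The function name `label_with_neutral` and the cutoff of 0.5 are hypothetical; the abstract does not state the threshold or the decision rule the authors used.

```python
# Illustrative sketch only: the threshold value and function name are
# assumptions, not taken from the study.
EMOTIONS = ("anger", "fear", "joy", "love", "sadness", "surprise")


def label_with_neutral(scores, threshold=0.5):
    """Return the top-scoring emotion, or "neutral" if its score is below threshold."""
    missing = [e for e in EMOTIONS if e not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    # Pick the emotion with the highest classifier score.
    top = max(EMOTIONS, key=lambda e: scores[e])
    # Fall back to the seventh category when the model is not confident.
    return top if scores[top] >= threshold else "neutral"


confident = {"anger": 0.05, "fear": 0.70, "joy": 0.05,
             "love": 0.05, "sadness": 0.10, "surprise": 0.05}
uncertain = {"anger": 0.20, "fear": 0.20, "joy": 0.15,
             "love": 0.15, "sadness": 0.20, "surprise": 0.10}

print(label_with_neutral(confident))
print(label_with_neutral(uncertain))
```

A single cutoff on the top score is the simplest possible rule; a margin between the two highest scores would be another reasonable choice.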

Results:

The findings showed that H5N1 outbreaks occurred in waves, with significant surges followed by lulls. States like Minnesota and Iowa were most affected, exhibiting high case counts and numerous outbreaks over time. Sentiment analysis revealed that negative emotions—“sadness,” “anger,” and “fear”—dominated discussions, comprising about 90% of posts. When accounting for a three-week delay in reactions to outbreak severity changes, weak positive correlations emerged between outbreak severity and the intensity levels of “anger,” “sadness,” and “joy” sentiments. The sentiment of “fear” showed a modest immediate correlation without temporal adjustment. Topic modeling highlighted concerns about the virus spreading among bird populations, rising egg prices due to poultry shortages, and the economic impact of culling policies.
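The three-week delay mentioned above can be sketched as a lagged correlation: severity in week t is paired with sentiment intensity in week t + 3. This is a minimal illustration under stated assumptions, namely that both series are aggregated weekly and that Pearson correlation is used; the abstract names neither the aggregation nor the correlation coefficient. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(severity, sentiment, lag_weeks):
    """Correlate severity at week t with sentiment at week t + lag_weeks."""
    if lag_weeks < 0:
        raise ValueError("lag_weeks must be non-negative")
    if lag_weeks == 0:
        return pearson(severity, sentiment)
    # Drop the last lag_weeks of severity and the first lag_weeks of sentiment
    # so that each remaining pair is offset by exactly lag_weeks.
    return pearson(severity[:-lag_weeks], sentiment[lag_weeks:])


# Toy weekly series (made up): sentiment mirrors severity three weeks later.
severity = [1, 3, 2, 5, 4, 6, 2, 1]
sentiment = [0, 0, 0, 2, 6, 4, 10, 8]

print(lagged_correlation(severity, sentiment, 3))
```

With `lag_weeks=0` the same function gives the unadjusted correlation, which corresponds to the immediate comparison reported for "fear".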

Conclusions:

Overall, these results underscore the critical role of social media analysis in understanding public reactions, including prevalent themes and sentiments, and guiding timely, targeted public health interventions during the H5N1 outbreak.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.