Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Nov 2, 2020
Date Accepted: Jan 20, 2021
Date Submitted to PubMed: Jan 26, 2021

The final, peer-reviewed published version of this preprint can be found here:

Jang H, Rempel E, Roth D, Carenini G, Janjua NZ

Tracking COVID-19 Discourse on Twitter in North America: Infodemiology Study Using Topic Modeling and Aspect-Based Sentiment Analysis

J Med Internet Res 2021;23(2):e25431

DOI: 10.2196/25431

PMID: 33497352

PMCID: 7879725

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Tracking COVID-19 Discourse on Twitter in North America: Topic Modeling and Aspect-based Sentiment Analysis

  • Hyeju Jang
  • Emily Rempel
  • David Roth
  • Giuseppe Carenini
  • Naveed Z. Janjua

ABSTRACT

Background:

Social media is a rich source of information about people’s reactions to social issues. Because COVID-19 has significantly affected people’s lives, it is essential to capture how people react to public health interventions and to understand their concerns.

Objective:

We aim to investigate people’s reactions to and concerns about COVID-19 in North America, with a particular focus on Canada.

Methods:

We analyze COVID-19-related tweets using topic modeling and aspect-based sentiment analysis (ABSA), and interpret the results with public health experts. To generate insights on the effectiveness of specific public health interventions for COVID-19, we compare the timelines of the topics discussed with the timing of the interventions, and we combine this comparison with information on people’s sentiment about COVID-19-related aspects. In addition, to further investigate anti-Asian racism, we compare timelines of sentiment toward Asians and toward Canadians.
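
As a rough illustration of the timeline comparison described above, the following Python sketch computes each topic's weekly share of tweets and compares the average share before and after an intervention date. It is a minimal sketch under stated assumptions, not the authors' pipeline: topic labels are assumed to be already assigned to each tweet (for example, by a topic model), and the function names, the toy tweets, and the intervention date are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_topic_share(tweets):
    """Map each ISO (year, week) to {topic: share of that week's tweets}.

    `tweets` is an iterable of (date, topic_label) pairs; the topic label
    is assumed to come from an earlier topic-modeling step.
    """
    counts = defaultdict(Counter)
    for day, topic in tweets:
        year, week = tuple(day.isocalendar())[:2]
        counts[(year, week)][topic] += 1
    shares = {}
    for week_key, counter in counts.items():
        total = sum(counter.values())
        shares[week_key] = {t: n / total for t, n in counter.items()}
    return shares


def mean_share(shares, topic, weeks):
    """Average share of `topic` over the given weeks (0.0 if no weeks)."""
    values = [shares[w].get(topic, 0.0) for w in weeks]
    return sum(values) / len(values) if values else 0.0


def share_before_after(tweets, topic, intervention_day):
    """Return (mean weekly share before, mean weekly share from) the
    ISO week that contains `intervention_day`."""
    shares = weekly_topic_share(tweets)
    cut = tuple(intervention_day.isocalendar())[:2]
    before = [w for w in shares if w < cut]
    after = [w for w in shares if w >= cut]
    return mean_share(shares, topic, before), mean_share(shares, topic, after)


# Hypothetical toy data: (posting date, assigned topic). Not real tweets.
TOY_TWEETS = [
    (date(2020, 3, 2), "travel"),
    (date(2020, 3, 3), "masks"),
    (date(2020, 3, 4), "travel"),
    (date(2020, 3, 5), "travel"),
    (date(2020, 3, 16), "distancing"),
    (date(2020, 3, 17), "distancing"),
    (date(2020, 3, 18), "travel"),
    (date(2020, 3, 19), "distancing"),
]

# Hypothetical intervention date, chosen only for illustration.
before, after = share_before_after(TOY_TWEETS, "distancing", date(2020, 3, 16))
print(before, after)
```

Weekly shares rather than raw counts are used here so that a change in overall tweet volume is not mistaken for a change in attention to one topic; that choice is an assumption of this sketch.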

Results:

Topic modeling identified 20 topics, and public health experts interpreted each topic based on its top-ranked words and representative tweets. The interpretation and timeline analysis showed that the discovered topics and their trends are closely related to public health promotions and interventions, such as physical distancing, border restrictions, hand washing, staying home, and face coverings. After training ABSA on the data with a human in the loop, we obtained 545 aspect terms (e.g., “vaccines”, “economy”, and “masks”) and 60 opinion terms (e.g., “infectious” as negative and “professional” as positive), which were used to infer sentiment for 20 selected aspects. The results showed negative sentiment related to the overall outbreak, misinformation, and Asians, and positive sentiment related to physical distancing.
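
To make the roles of aspect terms and opinion terms concrete, the sketch below scores an aspect by looking for opinion words within a few tokens of each aspect mention. This is a deliberately simplified, lexicon-and-window stand-in, not the ABSA model used in the study. The term lists contain only the examples quoted above; the window size, the function names, and the example sentences are hypothetical.

```python
import re
from collections import defaultdict

# Example terms quoted in the abstract; the real lists are far larger.
ASPECT_TERMS = {"vaccines", "economy", "masks"}
OPINION_TERMS = {"infectious": -1, "professional": 1}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def aspect_sentiment(texts, window=3):
    """Return {aspect: mean polarity in [-1, 1]} over all aspect mentions
    that have at least one opinion term within `window` tokens.

    The window heuristic is an assumption of this sketch.
    """
    scores = defaultdict(list)
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token not in ASPECT_TERMS:
                continue
            nearby = tokens[max(0, i - window): i + window + 1]
            polarity = sum(OPINION_TERMS.get(t, 0) for t in nearby)
            if polarity:
                scores[token].append(1 if polarity > 0 else -1)
    return {aspect: sum(v) / len(v) for aspect, v in scores.items()}


# Hypothetical example sentences, written for illustration only.
EXAMPLES = [
    "The staff handing out masks were very professional",
    "infectious crowds ignore masks",
    "professional advice on masks helps",
]

print(aspect_sentiment(EXAMPLES))
```

Aspects with no nearby opinion term are left out of the result rather than scored as neutral, so a missing key means "no evidence" in this sketch.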

Conclusions:

Analyses using natural language processing (NLP) techniques with domain expert involvement can produce useful information for public health. This study is the first to analyze COVID-19-related tweets in Canada in comparison with tweets in the United States by using topic modeling and human-in-the-loop, domain-specific aspect-based sentiment analysis. This kind of information could help public health agencies understand public concerns and identify which public health messages resonate with the populations who use Twitter, which can inform the design of policies for new interventions.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.