Accepted for/Published in: JMIR Infodemiology

Date Submitted: Jan 31, 2022
Date Accepted: Sep 15, 2022

The final, peer-reviewed published version of this preprint can be found here:

Hu M, Conway M

Perspectives of the COVID-19 Pandemic on Reddit: Comparative Natural Language Processing Study of the United States, the United Kingdom, Canada, and Australia

JMIR Infodemiology 2022;2(2):e36941

DOI: 10.2196/36941

PMID: 36196144

PMCID: 9521381

Using Reddit data to investigate perspectives on the COVID-19 pandemic using natural language processing: a comparative study of the US, the UK, Canada and Australia

  • Mengke Hu
  • Mike Conway

ABSTRACT

Background:

Since the World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020, the disease has had an unprecedented impact worldwide, with more than 276 million confirmed cases and 5.3 million deaths as of December 21, 2021 [1]. Social media platforms such as Reddit can serve as a resource for enhancing situational awareness, particularly for monitoring public attitudes and behavior during the crisis. The insights gained can then be used to support communication and health promotion messaging.

Objective:

In this work, we compare public attitudes towards the 2020/2021 COVID-19 pandemic across four predominantly English-speaking countries (the United States, the United Kingdom, Canada, and Australia) using data derived from the social media platform Reddit.

Methods:

We used a natural language processing method called topic modeling (more specifically, Latent Dirichlet Allocation). Topic modeling is a popular unsupervised learning technique that can be used to automatically infer topics (i.e., semantically related categories) from a large corpus of text. We derived our data from six country-specific, COVID-19-related subreddits (r/CoronavirusAustralia, r/CoronavirusDownunder, r/CoronavirusCanada, r/CanadaCoronavirus, r/CoronavirusUK, r/coronavirusus). We used topic modeling methods to investigate and compare topics of concern for each country.
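To make the idea of topic inference concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, written in plain Python. It is not the authors' implementation: the function names (fit_lda, top_words), the hyperparameter values (alpha, beta, the number of iterations), and the toy documents are all assumptions chosen for illustration, and a real study would use a full preprocessing pipeline and an established library.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; alpha and beta are assumed
    symmetric Dirichlet priors, not values taken from the paper.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (document, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Resample a topic in proportion to the conditional probability.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return up to n highest-count words for each topic."""
    return [
        [word for word, count in counts.most_common(n) if count > 0]
        for counts in topic_word
    ]


# Toy, made-up documents standing in for tokenized Reddit posts.
toy_docs = [
    ["lockdown", "policy", "lockdown", "restrictions"],
    ["vaccine", "trial", "vaccine", "dose"],
    ["lockdown", "restrictions", "policy"],
    ["vaccine", "dose", "trial"],
]
toy_doc_topic, toy_topic_word = fit_lda(toy_docs, num_topics=2)
print(top_words(toy_topic_word))
```

Each document ends up with a count of tokens per topic, and each topic with a count per word; the highest-count words in a topic are what an analyst would read to label it. Because the sampler is stochastic, the topics recovered depend on the seed and on the assumed priors.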

Results:

From the Reddit data, we found that (1) the volume of posting declined consistently across all four countries during the study period (February to November 2020); (2) the volume of posts peaked during lockdown events; and (3) the UK and Australian subreddits contained much more policy discussion, and less conspiratorial content, than the US or Canadian subreddits.

Conclusions:

This work demonstrated that (a) there were key differences between salient topics discussed across the four countries, and (b) Reddit data has the potential to provide insights not readily apparent in survey-based approaches.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.