Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 30, 2022
Date Accepted: Mar 16, 2023

The final, peer-reviewed published version of this preprint can be found here:

Disruptions in the Cystic Fibrosis Community’s Experiences and Concerns During the COVID-19 Pandemic: Topic Modeling and Time Series Analysis of Reddit Comments

Yao LFL, Ferawati K, Liew K, Wakamiya S, Aramaki E

J Med Internet Res 2023;25:e45249

DOI: 10.2196/45249

PMID: 37079359

PMCID: 10160941

The Disruption of the Cystic Fibrosis Community’s Experiences and Concerns during the COVID-19 Pandemic: Topic Modeling and Time Series Analysis of Reddit Comments

  • Lean Franzl Lim Yao
  • Kiki Ferawati
  • Kongmeng Liew
  • Shoko Wakamiya
  • Eiji Aramaki

ABSTRACT

Background:

Social media use rose significantly during the COVID-19 pandemic, as people used it to communicate and share information amid pandemic disruptions. Patients with rare diseases have used social media as an information network since before the pandemic, providing valuable insight into their everyday experiences. One platform that has proved relevant is Reddit. We therefore examine the experiences of patients with cystic fibrosis (CF), who are more vulnerable during the pandemic given the overlap of symptoms. Specifically, we look at the impact of COVID-19 on the discussion topics of CF patients.

Objective:

This study aims to identify the effect of COVID-19 on the discussion topics of the r/CysticFibrosis subreddit. We applied BERTopic models to posts and comments and performed a time series analysis to identify topics that concerned COVID-19 pandemic disruption.

Methods:

We used the Pushshift Reddit API to scrape all comments from the subreddit r/CysticFibrosis up to 31 August 2022. We removed duplicate comments, links, tags, and mentions of other users before applying a BERTopic model, and then reduced the number of topics to a more manageable 22. We fitted an autoregressive integrated moving average (ARIMA) model to the denoised dataset as a whole, without considering the topics, and to the subset of data for each of the 22 topics. We assigned a dummy variable to indicate the COVID-19 pandemic period, which we specified as the months of 2020, and controlled for the number of authors to examine topical changes before and after this time point.
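The cleaning step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the regular expressions, the reading of "tags" as markup tags, and the function names `clean_comment` and `preprocess` are all assumptions introduced here.

```python
import re

# Illustrative patterns only; the study's exact cleaning rules are not given.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")  # assumes "tags" means markup tags
MENTION_RE = re.compile(r"(?<!\w)/?u/[A-Za-z0-9_-]+")  # Reddit user mentions
WS_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Strip links, markup tags, and user mentions, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def preprocess(comments: list[str]) -> list[str]:
    """Clean each comment; drop empty results and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for raw in comments:
        text = clean_comment(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

The cleaned list would then be the input to a topic model. Deduplication here is applied after cleaning, so comments that differ only in a link or mention count as duplicates; whether the study deduplicated before or after cleaning is not stated.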

Results:

We collected 120,738 comments from 5,827 unique user IDs, posted between 24 March 2011 and 31 August 2022. After fitting the BERTopic model and excluding outliers and noise, we were left with 42,060 comments categorized into 22 topics. Significance testing of the COVID-19 dummy variable showed a mix of positive and negative effects across the topics.
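To make the COVID-19 dummy variable concrete, the sketch below builds one row per calendar month from (timestamp, author) pairs: the comment count as the response, the number of unique authors as the control, and a 0/1 indicator for months in 2020. The function name `monthly_design`, the field names, and the monthly granularity are assumptions for illustration; the rows could be passed to any ARIMA implementation that accepts exogenous regressors.

```python
from collections import defaultdict
from datetime import datetime, timezone


def monthly_design(comments):
    """Aggregate (utc_timestamp, author) pairs into one row per calendar month.

    Each row holds the comment count, the number of unique authors,
    and a dummy that is 1 for months in 2020 and 0 otherwise.
    """
    counts = defaultdict(int)
    authors = defaultdict(set)
    for ts, author in comments:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        key = (dt.year, dt.month)
        counts[key] += 1
        authors[key].add(author)

    rows = []
    for year, month in sorted(counts):
        rows.append({
            "month": f"{year:04d}-{month:02d}",
            "n_comments": counts[(year, month)],
            "n_authors": len(authors[(year, month)]),
            "covid": 1 if year == 2020 else 0,  # assumed: pandemic period = 2020
        })
    return rows
```

Months with no comments produce no row in this sketch; a regularly spaced series for ARIMA would need those gaps filled with zero counts. For a per-topic analysis, the same function could be applied to the subset of comments assigned to each topic.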

Conclusions:

Overall, COVID-19 had a negative effect on the number of comments in the subreddit r/CysticFibrosis. The mix of positive and negative effects of the COVID-19 dummy variable across the BERTopic topics indicates a shift in discussion topics. Topics discussing medications such as Trikafta and tobramycin, lung transplants and respiration, gratitude, sweat testing, mutations, medical facilities, and inheritance of CF had decreased activity, while the topic discussing marijuana had increased activity during the COVID-19 pandemic.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.