Accepted for/Published in: JMIR Formative Research

Date Submitted: Jul 22, 2023
Date Accepted: Jan 2, 2025

The final, peer-reviewed published version of this preprint can be found here:

Investigating Reddit Data on Type 2 Diabetes Management During the COVID-19 Pandemic Using Latent Dirichlet Allocation Topic Modeling and Valence Aware Dictionary for Sentiment Reasoning Analysis: Content Analysis

Nagpal M, Jalali N, Sherifali D, Morita P, Cafazzo J

Investigating Reddit Data on Type 2 Diabetes Management During the COVID-19 Pandemic Using Latent Dirichlet Allocation Topic Modeling and Valence Aware Dictionary for Sentiment Reasoning Analysis: Content Analysis

JMIR Form Res 2025;9:e51154

DOI: 10.2196/51154

PMID: 39983050

PMCID: 11870598

Managing Type 2 Diabetes during the COVID-19 Pandemic: An Investigation of Reddit Data using LDA Topic Modelling and VADER Sentiment Analysis

  • Meghan Nagpal; 
  • Niloofar Jalali; 
  • Diana Sherifali; 
  • Plinio Morita; 
  • Joseph Cafazzo

ABSTRACT

Background:

Type 2 Diabetes (T2D) is a chronic disease that can be managed in part through healthy behaviours. However, the COVID-19 pandemic impacted how people managed their condition. Analyzing social media forums as a source of Patient-Generated Health Data (PGHD) presents an opportunity to understand health behaviours from the perspective of the patient.

Objective:

Our objective is to understand how the health behaviours and attitudes of people living with T2D were impacted by the early stages of the COVID-19 pandemic by examining Reddit forums (using PGHD) for people living with T2D.

Methods:

Data from Reddit forums related to T2D, covering January 2018 to early March 2021, were downloaded, and Support Vector Machines (SVMs) were used to classify whether each post was made in the context of the pandemic. Latent Dirichlet Allocation (LDA) topic modelling was performed to identify topics of discussion across the entire dataset, and a subsequent iteration was performed to identify topics of discussion specific to the COVID-19 pandemic. Sentiment analysis using the Valence Aware Dictionary for sEntiment Reasoning (VADER) algorithm was performed to gauge attitudes towards the pandemic.
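As a rough illustration of the sentiment step, the following is a minimal sketch of lexicon-based scoring in the style of VADER, not the VADER library itself and not the authors' code. The toy lexicon, its valence values, the function names, and the example posts are all hypothetical. The only detail taken from VADER is the normalization of the summed valence into a compound score between -1 and 1 using a constant of 15. VADER's rules for negation, intensifiers, punctuation, and capitalization are omitted.

```python
import math
from statistics import mean

# Hypothetical toy lexicon with made-up valence values (illustration only).
# The real VADER lexicon contains thousands of human-rated entries.
TOY_LEXICON = {
    "good": 2.0,
    "great": 3.0,
    "bad": -2.5,
    "worried": -1.5,
    "scared": -2.0,
    "stress": -1.5,
}

ALPHA = 15  # normalization constant used in VADER's compound score


def compound_score(text):
    """Sum the valence of known words, then squash the sum into (-1, 1)."""
    tokens = [tok.strip(".,!?;:").lower() for tok in text.split()]
    total = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)
    return total / math.sqrt(total * total + ALPHA)


def mean_sentiment_by_label(posts):
    """Average the compound score of (label, text) pairs per label."""
    scores = {}
    for label, text in posts:
        scores.setdefault(label, []).append(compound_score(text))
    return {label: mean(values) for label, values in scores.items()}


# Invented example posts; the labels stand in for the SVM output.
example_posts = [
    ("covid", "I am scared and worried about stress"),
    ("noncovid", "Great numbers today, feeling good"),
]
print(mean_sentiment_by_label(example_posts))
```

Grouping scores by the classifier's label, as `mean_sentiment_by_label` does, is one simple way to compare sentiment between COVID-related and other posts. The study's actual comparison method is not described in this abstract.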

Results:

Across all posts, topics of discussion were classified into the themes of Managing Lifestyle, Managing Blood Glucose, Obtaining Diabetes Care, and Coping & Receiving Support. Amongst the COVID-specific posts, topics of discussion were Coping with Poor Mental Health, Accessing Doctor & Medications and Controlling Blood Glucose, Changing Food Habits during Pandemic, Impact of Stress on Blood Glucose Levels, Changing Status of Employment & Insurance, and Risk of COVID Complications. Overall, posts classified as COVID-related were associated with lower sentiment than those classified as “noncovid.”

Conclusions:

Topics of discussion gauged from the Reddit forums provide a holistic perspective of the impact of the pandemic on people living with T2D. Overall, the early stages of the pandemic negatively impacted the attitudes of people living with T2D.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.