Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 15, 2019
Open Peer Review Period: Mar 18, 2019 - May 13, 2019
Date Accepted: Feb 26, 2020

The final, peer-reviewed published version of this preprint can be found here:

Understanding Weight Loss via Online Discussions: Content Analysis of Reddit Posts Using Topic Modeling and Word Clustering Techniques

Liu Y, Yin Z

J Med Internet Res 2020;22(6):e13745

DOI: 10.2196/13745

PMID: 32510460

PMCID: 7308899

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Understanding Weight Loss via Online Discussions: Content Analysis of Reddit Posts Using Topic Modeling and Word Clustering Techniques

  • Yang Liu
  • Zhijun Yin

Background:

Maintaining a healthy weight can reduce the risk of developing many diseases, including type 2 diabetes, hypertension, and certain types of cancers. Online social media platforms are popular among people seeking social support regarding weight loss and sharing their weight loss experiences, which provides opportunities for learning about weight loss behaviors.

Objective:

This study aimed to investigate the extent to which users' weight loss was associated with the content they posted in the r/loseit subreddit, an online community for discussing weight loss, and with their online interactions, measured as the number of replies and votes that these users received.

Methods:

All posts that were published before January 2018 in r/loseit were collected. We focused on users who revealed their start weight, current weight, and goal weight and were active in this online community for at least 30 days. A topic modeling technique and a hierarchical clustering algorithm were used to obtain both global topics and local word semantic clusters. Finally, we used a regression model to learn the association between weight loss and topics, word semantic clusters, and online interactions.
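
The user selection step described above can be illustrated with a minimal sketch. It is not the authors' code: the text pattern for self-reported weights ("SW", "CW", "GW"), the definition of active days as the span between a user's first and last post, and the names parse_weights, active_days, and select_users are all assumptions made for illustration.

```python
import re
from datetime import date

# Assumed format: weights written as "SW: 250", "CW: 200", "GW: 160".
# The abstract does not say how the weights were actually extracted.
WEIGHT_PATTERN = re.compile(r"\b(SW|CW|GW)\s*[:=]?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_weights(text):
    """Return the weights found in text, e.g. {"SW": 250.0, "CW": 200.0}."""
    weights = {}
    for label, value in WEIGHT_PATTERN.findall(text):
        weights[label.upper()] = float(value)
    return weights


def active_days(post_dates):
    """Days between first and last post (one possible definition of 'active')."""
    return (max(post_dates) - min(post_dates)).days


def select_users(users, min_active_days=30):
    """Keep users who report all three weights and were active long enough.

    users maps a user name to a dict with the hypothetical keys
    "text" (concatenated posts) and "post_dates" (list of dates).
    """
    selected = {}
    for name, record in users.items():
        weights = parse_weights(record["text"])
        if not {"SW", "CW", "GW"} <= set(weights):
            continue  # at least one of the three weights is missing
        if active_days(record["post_dates"]) < min_active_days:
            continue  # not active for long enough
        selected[name] = {**weights, "weight_loss": weights["SW"] - weights["CW"]}
    return selected


# Toy data, invented for this example only.
users = {
    "user_a": {"text": "SW: 250 CW: 200 GW: 160",
               "post_dates": [date(2017, 1, 1), date(2017, 3, 1)]},
    "user_b": {"text": "SW: 180 CW: 175",
               "post_dates": [date(2017, 1, 1), date(2017, 6, 1)]},
    "user_c": {"text": "SW: 300 CW: 290 GW: 200",
               "post_dates": [date(2017, 1, 1), date(2017, 1, 10)]},
}
selected = select_users(users)
print(sorted(selected))
```

In this toy data only user_a passes both filters: user_b never states a goal weight, and user_c was active for fewer than 30 days.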

Results:

Our data comprised 477,904 posts that were published by 7660 users within a span of 7 years. We identified 25 topics, including food and drinks, calories, exercises, family members and friends, and communication. Our results showed that start weight (β=.823; P<.001), active days (β=.017; P=.009), median number of votes (β=.263; P=.02), mentions of exercises (β=.145; P<.001), and mentions of nutrition (β=.120; P<.001) were associated with greater weight loss. Users who lost more weight might be motivated by the negative emotions (β=−.098; P<.001) that they experienced before starting the journey of weight loss. In contrast, users who mentioned vacations (β=−.108; P=.005) and payments (β=−.112; P=.001) tended to experience relatively less weight loss. Mentions of family members (β=−.031; P=.03) and employment status (β=−.041; P=.03) were associated with less weight loss as well.

Conclusions:

Our study showed that both online interactions and offline activities were associated with weight loss, suggesting that future interventions based on existing online platforms should focus on both aspects. Our findings suggest that online personal health data can be used to learn about health-related behaviors effectively.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.