Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 15, 2019
Open Peer Review Period: Mar 18, 2019 - May 13, 2019
Date Accepted: Feb 26, 2020

The final, peer-reviewed published version of this preprint can be found here:

Liu Y, Yin Z

Understanding Weight Loss via Online Discussions: Content Analysis of Reddit Posts Using Topic Modeling and Word Clustering Techniques

J Med Internet Res 2020;22(6):e13745

DOI: 10.2196/13745

PMID: 32510460

PMCID: 7308899

“I Hit Onederland Today!!!”: Learning Factors Associated with Weight Loss via Online Discussions

  • Yang Liu
  • Zhijun Yin

ABSTRACT

Background:

Maintaining a healthy weight can lower the risk of developing many health issues, including type II diabetes, high blood pressure, and certain types of cancer. People commonly use online social media platforms to seek and share weight loss experiences, which provides an opportunity to learn about weight loss behaviors.

Objective:

To discover the topics conveyed in online discussions on weight loss and investigate the extent to which the factors disclosed in these discussions are associated with weight loss.

Methods:

We collected posts published before January 13, 2018 in the r/loseit subreddit, an online discussion board where people discuss weight loss-related topics. We focused on users who mentioned their start weight, current weight, and goal weight over a period of at least 30 days. We applied a topic modeling technique to obtain the main themes discussed in this subreddit, and then applied trend analysis to discover the temporal trend of topic prevalence. To gain insight into the detailed potential factors related to weight loss, we applied hierarchical clustering to obtain word semantic clusters. Finally, we applied a regression model to learn the association between topics, word semantic clusters, and the weight loss reported in this subreddit. Predictors that were statistically significant at a level of 0.05 were reported.
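The cohort selection step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `SW:/CW:/GW:` text pattern, the `Post` record, the `weight_loss_by_user` helper, and the reading of "at least 30 days" as the span between a user's first and last post are all assumptions, since the abstract does not say how weights were extracted.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical pattern: assumes weights appear as "SW: 250 CW: 210 GW: 180".
# The abstract does not specify the extraction rule.
WEIGHT_RE = re.compile(
    r"SW:\s*(?P<sw>\d+(?:\.\d+)?)\D+?"
    r"CW:\s*(?P<cw>\d+(?:\.\d+)?)\D+?"
    r"GW:\s*(?P<gw>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


@dataclass
class Post:
    """Hypothetical minimal post record."""
    user: str
    posted: date
    text: str


def weight_loss_by_user(posts, min_days=30):
    """Return {user: start weight - current weight} for users who report all
    three weights and whose posts span at least `min_days` days (assumed
    interpretation of the inclusion criterion)."""
    first_seen, last_seen, latest = {}, {}, {}
    for post in sorted(posts, key=lambda p: p.posted):
        first_seen.setdefault(post.user, post.posted)
        last_seen[post.user] = post.posted
        match = WEIGHT_RE.search(post.text)
        if match:
            # Later posts overwrite earlier ones, so the newest report wins.
            latest[post.user] = (float(match["sw"]), float(match["cw"]))
    result = {}
    for user, (start, current) in latest.items():
        if (last_seen[user] - first_seen[user]).days >= min_days:
            result[user] = start - current
    return result


# Made-up example posts, not data from the study.
posts = [
    Post("alice", date(2017, 1, 1), "Starting out! SW: 250 CW: 250 GW: 180"),
    Post("alice", date(2017, 3, 1), "Update. SW: 250 CW: 228 GW: 180"),
    Post("bob", date(2017, 1, 1), "SW: 200 CW: 195 GW: 170"),
    Post("bob", date(2017, 1, 10), "Feeling good, no numbers today"),
    Post("carol", date(2017, 1, 1), "Just lurking"),
]
losses = weight_loss_by_user(posts)
```

Here only the hypothetical user `alice` is retained: `bob` was active for fewer than 30 days and `carol` never reported any weights.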

Results:

Our data consisted of 477,904 posts published by 7,660 users over a period of 7 years. We found that the number of posts published in this forum rose rapidly after 2016. By applying Latent Dirichlet Allocation, we identified 25 topics that were mainly discussed in this forum, including drinking, calories, exercises, family members, and lifestyles. Our regression analysis showed that the start weight (β=0.823, P<0.001), live days (β=0.017, P=0.009), and the number of votes received by a user (β=0.263, P=0.016) were positively associated with weight loss. Further, our findings suggested that exercises (e.g., mentions of workout clothes, β=0.145, P<0.001) and nutrition (β=0.120, P<0.001) were the most effective factors associated with weight loss. Users who had higher weight loss might be motivated by the negative emotions (β=-0.098, P<0.001) that they experienced in social activities before starting their weight loss journey. By contrast, users who mentioned vacations (β=-0.108, P=0.005), payments (β=-0.112, P=0.001), and supermarkets (β=-0.060, P=0.008) tended to experience relatively lower weight loss. Mentions of family members (β=-0.031, P=0.033) and employment status (β=-0.041, P=0.033) were also negatively associated with weight loss.
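To make the reading of such coefficients concrete, the sketch below fits a least-squares line with a single predictor on made-up numbers. It is a simplification under stated assumptions: the study's model used many predictors at once, the abstract does not say whether coefficients were standardized, and the helper name `ols_slope_intercept` and all values are hypothetical.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up toy numbers, not data from the study.
start_weight = [180.0, 200.0, 220.0, 240.0]
weight_lost = [10.0, 14.0, 18.0, 22.0]

slope, intercept = ols_slope_intercept(start_weight, weight_lost)
# A positive slope means higher start weight goes with more weight lost
# in this toy sample; a negative slope would mean the opposite.
```

In this toy sample the fitted slope is positive, which corresponds to the "positively associated" wording used for predictors such as start weight.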

Conclusions:

Our findings indicated that what people disclosed in online discussions contained many factors that were statistically significantly associated with their weight loss. Our study contributes to the evidence that online personal health data can be effectively applied to learn about health-related behaviors.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.