Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Mar 27, 2024
Date Accepted: Oct 16, 2024

The final, peer-reviewed published version of this preprint can be found here:

Applying Natural Language Processing Techniques to Map Trends in Insomnia Treatment Terms on the r/Insomnia Subreddit: Infodemiology Study

Cummins JA, Gottlieb DJ, Sofer T, Wallace DA

Applying Natural Language Processing Techniques to Map Trends in Insomnia Treatment Terms on the r/Insomnia Subreddit: Infodemiology Study

J Med Internet Res 2025;27:e58902

DOI: 10.2196/58902

PMID: 39786862

PMCID: 11757973

Applying Natural Language Processing (NLP) Techniques to Map Trends in Insomnia Treatment Terms on the r/insomnia Subreddit

  • Jack A Cummins
  • Daniel J Gottlieb
  • Tamar Sofer
  • Danielle A Wallace

ABSTRACT

Background:

Online communities and message boards, such as Reddit, serve as spaces where people can share their health-related experiences and treatments, including their experience of insomnia. Natural language processing (NLP) tools can be used to understand the terms that appear in online discussions of insomnia and insomnia treatments.

Objective:

To understand the language used in discussing insomnia on online message boards and evaluate trends in discussion of treatments over time.

Methods:

We performed an NLP analysis of the r/insomnia subreddit. Using the Pushshift API, we obtained all subreddit comments from the history of r/insomnia up to the end of 2022. A bag-of-words model was used to identify the 1,000 most frequently used terms, which were manually reduced to 35 terms related to treatment and medication use. Regular expression (RE) analysis was used to identify and count comments containing specific words, and sentiment analysis was used to predict the sentiment of comments. Data from 2013 to 2022 were visually examined for trends.
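The bag-of-words and regular expression steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the sample comments, the tokenizer, the helper names (`tokenize`, `top_terms`, `count_comments_matching`), and the regular expression patterns are all assumptions made for illustration, and the sentiment analysis step is omitted because the abstract does not name the tool used.

```python
import re
from collections import Counter

# Hypothetical sample comments; the study used the full r/insomnia comment history.
comments = [
    "Melatonin helps me fall asleep, but Ambien worked better.",
    "My doctor suggested CBT-I instead of melatonin.",
    "Has anyone tried CBT for insomnia?",
]

def tokenize(text):
    """Lowercase the text and split it into words, keeping internal hyphens (e.g. 'cbt-i')."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def top_terms(texts, n):
    """Bag of words: count every token across all texts and return the n most frequent."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common(n)

def count_comments_matching(texts, pattern):
    """Count the comments that contain at least one match of a case-insensitive pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for text in texts if rx.search(text))

# Illustrative patterns only; the study's actual expressions are not given in the abstract.
term_patterns = {
    "melatonin": r"\bmelatonin\b",
    "CBT-I": r"\bcbt(?:-?i)?\b",
    "Ambien": r"\bambien\b",
}
term_counts = {
    name: count_comments_matching(comments, pattern)
    for name, pattern in term_patterns.items()
}

print(top_terms(comments, 1))
print(term_counts)
```

Counting comments rather than raw word occurrences means a comment that repeats a term several times is counted once, which matches the abstract's description of counting "comments containing specific words".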

Results:

There were 340,130 comments on r/insomnia from the beginning of the subreddit to 2022. Of the 35 top treatment and medication terms that were identified, melatonin, Cognitive Behavioral Therapy for insomnia (CBT-I), and Ambien were the most frequently used. When frequency of individual terms was compared over time, terms related to CBT-I increased over time and terms related to non-prescription over-the-counter sleep aids (such as Benadryl) decreased over time. Terms with the most positive sentiment included “hygiene”, “valerian”, and “CBT”.
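One way to compare term frequency over time is to compute, for each year, the share of comments that mention a term. The sketch below assumes each comment is available as a pair of a Unix timestamp (as in Reddit's `created_utc` field) and a body string. The sample records, the `yearly_share` helper, and the choice to normalize by the yearly comment total are illustrative assumptions, not details taken from the study.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical (timestamp, body) records; timestamps are Unix seconds in UTC.
records = [
    (1428710400, "Benadryl knocks me out"),    # 2015
    (1428796800, "I cannot sleep at all"),     # 2015
    (1491868800, "Started CBT-I last month"),  # 2017
    (1491955200, "CBT changed my sleep"),      # 2017
    (1492041600, "Benadryl stopped working"),  # 2017
    (1492128000, "Any tips?"),                 # 2017
]

def yearly_share(rows, pattern):
    """Return {year: fraction of that year's comments matching the pattern}."""
    rx = re.compile(pattern, re.IGNORECASE)
    totals, hits = Counter(), Counter()
    for timestamp, body in rows:
        year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
        totals[year] += 1
        if rx.search(body):
            hits[year] += 1
    # A Counter returns 0 for missing keys, so years without a match get 0.0.
    return {year: hits[year] / totals[year] for year in sorted(totals)}

print(yearly_share(records, r"\bbenadryl\b"))
print(yearly_share(records, r"\bcbt(?:-?i)?\b"))
```

Dividing by the yearly total keeps growth in overall subreddit activity from being mistaken for growth in the use of a particular term.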

Conclusions:

Individuals use the r/insomnia subreddit to discuss insomnia treatment. This analysis suggests that language related to CBT-I (with a spike in 2017 following recommendations by the American College of Physicians), benzodiazepines, trazodone, and antidepressant medication use increased from 2013 to 2022. Language related to over-the-counter or alternative therapies, such as melatonin, cannabidiol, and marijuana, also fluctuated over time. These findings describe the frequency of, and trends in, language surrounding treatment and medication terms, which may be useful to practitioners and researchers alike.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.