Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Sep 30, 2025
Date Accepted: Mar 24, 2026
Large Language Model–Based Analysis of Statin Therapy Discussions and Sentiment on Social Media: Cross-Sectional Observational Study
ABSTRACT
Background:
Statin therapy, despite its proven cardiovascular benefits, remains underused, and social media platforms may reveal patient perspectives not captured in clinical settings.
Objective:
To characterize themes, sentiment, and decision-making factors related to statin therapy through large language model (LLM)–based analysis of Reddit discussions.
Methods:
This cross-sectional observational study analyzed English-language Reddit posts and comments about statins from January 2022 to May 2025, identified via keyword-based Reddit application programming interface (API) searches (≤1,000 posts per keyword). A total of 5,328 discussions from public statin- and cholesterol-focused communities were included. Posts and comments that contained terms for specific statins or cholesterol management were examined for self-reported experiences with statin therapy. Thematic groups, sentiment (positive, neutral, negative), clinical relevance, information-seeking behavior, side-effect reporting, decision factors, and adherence issues were extracted using an LLM-based pipeline.
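As a rough illustration of the screening and extraction steps described above, the following sketch shows a keyword filter and a per-discussion record. It is not the authors' pipeline: the keyword list, the field names, and the `Extraction` and `mentions_statins` helpers are hypothetical, since the abstract does not give the actual search terms or output schema.

```python
import re
from dataclasses import dataclass, field
from typing import List, Literal

# Hypothetical keyword list; the abstract does not list the real search terms.
STATIN_TERMS = ["statin", "atorvastatin", "rosuvastatin", "simvastatin", "cholesterol"]
_PATTERN = re.compile(r"\b(" + "|".join(STATIN_TERMS) + r")s?\b", re.IGNORECASE)

Sentiment = Literal["positive", "neutral", "negative", "mixed"]


@dataclass
class Extraction:
    """Hypothetical record of the fields an LLM step would fill in per discussion."""
    themes: List[str] = field(default_factory=list)
    overall_sentiment: Sentiment = "neutral"
    high_clinical_relevance: bool = False
    seeks_information: bool = False
    side_effects: List[str] = field(default_factory=list)
    decision_factors: List[str] = field(default_factory=list)
    adherence_issue: bool = False


def mentions_statins(text: str) -> bool:
    """Return True if the text contains any of the assumed keywords."""
    return _PATTERN.search(text) is not None
```

A post such as "My doctor put me on Atorvastatin last month" would pass this filter, and the LLM output for it would then be parsed into one `Extraction` record.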
Results:
Of 5,328 discussions (1,661 posts, 3,667 comments), thematic analysis identified key topics: side-effect concerns (31.9%), decision-making based on laboratory results and physician advice (51.9% and 38.2%, respectively), information requests (46.1%), and alternative therapies (46.6%). Overall sentiment was neutral in 34.0% (95% CI, 32.7-35.3), negative in 30.9% (95% CI, 29.7-32.1), and positive in 16.9% (95% CI, 15.9-17.9); the remainder were mixed. Statin-specific sentiment was neutral in 44.1% (95% CI, 42.7-45.5), negative in 25.2% (95% CI, 24.0-26.4), and positive in 12.5% (95% CI, 11.6-13.4); the remainder did not mention sentiment about statins. High clinical relevance was identified in 12.6% (95% CI, 11.7-13.5) of discussions. Adherence issues were reported in 29.8% (95% CI, 28.6-31.0), with muscle pain (7.6% of side-effect reports) and fatigue (6.5% of side-effect reports) as common side effects.
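The abstract does not state how the 95% CIs were computed. As a sketch under that caveat, the reported intervals for overall sentiment are consistent with a normal-approximation (Wald) interval for a proportion with n=5,328; the function name `wald_ci` is illustrative.

```python
import math


def wald_ci(proportion: float, n: int, z: float = 1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    se = math.sqrt(proportion * (1 - proportion) / n)
    return proportion - z * se, proportion + z * se


# Overall neutral sentiment: 34.0% of 5,328 discussions.
lo, hi = wald_ci(0.340, 5328)
print(f"{100 * lo:.1f}-{100 * hi:.1f}")  # 32.7-35.3
```

Other interval methods (for example, Wilson) would give nearly identical bounds at this sample size, so agreement with the reported values does not identify the method used.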
Conclusions:
LLM-enabled analysis of Reddit discussions reveals substantial negative sentiment, adherence challenges, and side-effect concerns regarding statin therapy, highlighting opportunities for targeted patient education and shared decision-making to improve cardiovascular disease prevention.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.