Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jan 3, 2024
Date Accepted: Nov 6, 2024
Understanding superusers’ and regular users’ engagement in UK Respiratory Online Health Communities and the impact of their interaction on sentiment: A BioBERT Perspective
ABSTRACT
Background:
Online Health Communities (OHCs) enable people with Long-Term Conditions (LTCs) to exchange experiential self-management information, advice, and support with peers. The engagement of ‘superusers’, i.e., highly active users, plays a key role in holding the community together and ensuring an effective exchange of support and information. Further studies are needed to explore regular users’ interactions with superusers, their sentiment during these interactions, and the ultimate impact on the self-management of LTCs.
Objective:
This study aims to better understand the distribution and dynamics of sentiment in posts from two online respiratory communities, focusing on regular users’ interactions with superusers.
Methods:
We conducted sentiment analysis on anonymized data from two UK respiratory OHCs hosted by the charities Asthma UK (AUK) and the British Lung Foundation (BLF), covering 2006-2016 and 2012-2016, respectively. We used Bio-Bidirectional Encoder Representations from Transformers (BioBERT), a pre-trained language representation model. Given the scarcity of health-related labelled datasets, BioBERT was fine-tuned on the Covid-19 Twitter Dataset. Positive, neutral, and negative sentiment were coded as 1, 0, and −1, respectively. The average sentiment of regular users’ and superusers’ aggregated posts was then calculated. Superusers were identified in two ways: with a definition already employed in our previous work (i.e., ‘the 1% users with the largest number of posts over the observation period’) and with VoteRank, which selects the users with the best spreading ability. The sentiment of posts by superusers identified with each approach was then analyzed for correlation.
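As a rough illustration of the aggregation step described above, the following Python sketch codes sentiment labels as 1, 0, and −1, selects the top 1% of users by post count, and averages sentiment per user group. The toy post records, the `top_share` parameter, and the rule of keeping at least one superuser are assumptions made for illustration and are not details reported by the study.

```python
from collections import Counter
from statistics import mean

# Numeric coding of sentiment labels, as described in the Methods.
SENTIMENT_CODE = {"positive": 1, "neutral": 0, "negative": -1}

def find_superusers(posts, top_share=0.01):
    """Return the users in the top `top_share` of users by number of posts."""
    counts = Counter(user for user, _ in posts)
    # Assumption: round to a whole number of users and keep at least one.
    k = max(1, round(len(counts) * top_share))
    return {user for user, _ in counts.most_common(k)}

def mean_sentiment_by_group(posts, superusers):
    """Average coded sentiment of posts by superusers and by regular users."""
    groups = {"superuser": [], "regular": []}
    for user, label in posts:
        group = "superuser" if user in superusers else "regular"
        groups[group].append(SENTIMENT_CODE[label])
    return {g: (mean(v) if v else None) for g, v in groups.items()}

# Hypothetical toy data: (user_id, predicted sentiment label) pairs.
posts = [
    ("u1", "positive"), ("u1", "positive"), ("u1", "neutral"), ("u1", "negative"),
    ("u2", "negative"), ("u3", "positive"), ("u3", "neutral"),
]
superusers = find_superusers(posts)
averages = mean_sentiment_by_group(posts, superusers)
```

With real data, `posts` would hold one entry per post with the label predicted by the fine-tuned model; ties in post counts are resolved arbitrarily in this sketch.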
Results:
The fine-tuned BioBERT model achieved an accuracy of 0.96. The sentiment of posts was predominantly positive (60% and 65% of overall posts in AUK and BLF, respectively) and remained stable over the years, with a tendency to become more positive over time. Overall, superusers tended to write shorter posts characterized by positive sentiment (63% and 67% of all posts in AUK and BLF, respectively). Superusers defined by posting activity and by VoteRank largely overlapped (61% in AUK and 79% in BLF), showing that users who posted the most were also spreaders. Threads initiated by superusers typically encouraged regular users to reply with positive sentiment. In threads started by regular users, superusers wrote positive replies more often than other regular users did, regardless of the sentiment of the starting post (i.e., positive, neutral, or negative): 62%, 51%, 61% versus 55%, 45%, 50% in AUK; 71%, 62%, 64% versus 65%, 56%, 57% in BLF, respectively (P < .001, except for neutral sentiment in AUK, where P = .36).
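The abstract does not state how the overlap between the two superuser definitions was computed. One plausible reading, shown here only as an assumption, is the share of activity-based superusers who also appear in the VoteRank set. The user identifiers below are hypothetical.

```python
def overlap_share(by_activity, by_voterank):
    """Share of activity-based superusers that also appear in the VoteRank set."""
    if not by_activity:
        return 0.0
    return len(by_activity & by_voterank) / len(by_activity)

# Hypothetical superuser sets produced by the two definitions.
by_activity = {"u1", "u2", "u3", "u4", "u5"}
by_voterank = {"u2", "u3", "u4", "u7", "u9"}
share = overlap_share(by_activity, by_voterank)
```

Other overlap measures, such as the Jaccard index, would give different values for the same two sets.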
Conclusions:
Network and Sentiment Analysis provide insight into the key sustaining role of superusers in respiratory OHCs, showing they tend to write and trigger regular users’ posts characterized by positive sentiment.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.