Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: May 11, 2023
Open Peer Review Period: May 11, 2023 - Jul 6, 2023
Date Accepted: Apr 18, 2024
Identifying Reddit Users at a High Risk of Suicide and Their Linguistic Features During the COVID-19 Pandemic: A Growth-Based Trajectory Model
ABSTRACT
Background:
Suicide has become a serious public health issue during the coronavirus disease 2019 (COVID-19) pandemic. Due to social distancing, social media has provided a platform for suicidal individuals to post their thoughts and behaviors. However, current suicide studies using social media data have failed to recognize users’ heterogeneity and the temporal nature of suicide risk.
Objective:
By examining variations in the trajectories of post volume among users on the r/SuicideWatch subreddit during the COVID-19 pandemic, we aimed to investigate the heterogeneous patterns of change in suicide risk to help identify social media users at a high risk of suicide. We also characterized their linguistic features before and during the pandemic.
Methods:
We collected and analyzed post data every half year from March 2019 to September 2022 among users on the r/SuicideWatch subreddit (N = 6,163). A growth-based trajectory model was then used to investigate the trajectories of post volumes for identifying patterns of change in suicide risk during the pandemic. Trends in linguistic features within posts were also charted and compared, and linguistic markers were identified across the trajectory groups using regression analysis.
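As a rough illustration of the data preparation step described above, per-user post counts could be binned into half-year windows along the following lines. This is a minimal sketch and not the authors' code: the window boundaries (seven 6-month windows starting March 1, 2019), the function names, and the (user_id, date) input format are all assumptions made for illustration. The trajectory model itself is not shown.

```python
from collections import defaultdict
from datetime import date

START = date(2019, 3, 1)  # assumed start of the first half-year window
N_PERIODS = 7             # assumed: seven 6-month windows, ending before Sep 1, 2022


def period_index(d):
    """Return the 0-based half-year window for a date, or None if out of range."""
    months = (d.year - START.year) * 12 + (d.month - START.month)
    if months < 0:
        return None
    idx = months // 6
    return idx if idx < N_PERIODS else None


def post_volume_trajectories(posts):
    """Count posts per user per window.

    posts: iterable of (user_id, date) pairs (hypothetical input format).
    Returns {user_id: [count for each of the N_PERIODS windows]}.
    """
    traj = defaultdict(lambda: [0] * N_PERIODS)
    for user, d in posts:
        idx = period_index(d)
        if idx is not None:
            traj[user][idx] += 1
    return dict(traj)


# Made-up example data, for illustration only
example_posts = [
    ("user_a", date(2019, 4, 2)),
    ("user_a", date(2020, 3, 15)),
    ("user_a", date(2020, 5, 1)),
    ("user_b", date(2021, 10, 9)),
]
trajectories = post_volume_trajectories(example_posts)
```

Each resulting list is one user's post-volume trajectory, which is the kind of repeated-measures input a trajectory model would take.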
Results:
We identified two distinct trajectories of post volume among r/SuicideWatch subreddit users. A small proportion of users (12.08%) were labeled as being at a high risk of suicide, with a sharp and lasting increase in post volume during the pandemic, while the majority of users (87.92%) were categorized as being at a low risk of suicide, with a consistently low and mildly increasing post volume during the pandemic. In terms of the frequency of most linguistic features, both groups showed increases at the initial stage of the pandemic; afterward, the rising trend continued in the high-risk group before declining, while the low-risk group showed an immediate decrease. One year after the pandemic outbreak, the two groups differed in their use of several words within the categories of personal pronouns, affective, social, cognitive, perceptual, and biological processes, drives, relativity, time orientations, and personal concerns. In particular, the high-risk group was distinguished by its use of words related to anger, sadness, feelings, health, motion, and death during this stage.
Conclusions:
Based on the two identified trajectories of post volume during the pandemic, this study divided users in the r/SuicideWatch subreddit into suicide high- and low-risk groups. Our findings indicated heterogeneous patterns of change in suicide risk in response to the pandemic. The high-risk group also demonstrated distinct linguistic features. We recommend conducting real-time surveillance of suicide risk using social media data during future public health crises to provide timely support to individuals at a potentially high risk of suicide.
Clinical Trial:
NA
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.