Accepted for/Published in: JMIR Infodemiology
Date Submitted: Dec 24, 2020
Date Accepted: Mar 17, 2021
Date Submitted to PubMed: Mar 19, 2021
Monitoring Depression Trend on Twitter during the COVID-19 Pandemic: Observational Study
ABSTRACT
Background:
The COVID-19 pandemic has severely affected people’s daily lives and caused tremendous economic loss worldwide. Anecdotal evidence suggests that the pandemic has increased depression levels in the population. However, systematic studies of depression detection and monitoring during the pandemic are lacking.
Objective:
This study aims (1) to develop a method that accurately identifies people with depression by analyzing their tweets and (2) to monitor population-level depression on Twitter.
Methods:
To study this subject, we design an effective regular expression-based search method and create by far the largest English Twitter depression dataset, containing 2,575 distinct users identified as having depression (N=2,575) together with their past tweets. To examine the effect of depression on people’s Twitter language, we train three transformer-based depression classification models on the dataset, evaluate their performance with progressively larger training sets, and compare the models’ “tweet chunk”-level and user-level performance. Furthermore, inspired by psychological studies, we create a fusion classifier that combines deep learning model scores with psychological text features and users’ demographic information, and we investigate how these features relate to depression signals. Finally, we demonstrate our model’s capability of monitoring both group-level and population-level depression trends by presenting two of its applications during the COVID-19 pandemic.
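The two mechanical steps named above, a regular expression search for self-reported depression and the aggregation of "tweet chunk" scores into a user-level decision, can be illustrated with a minimal sketch. This is not the authors' implementation: the pattern `SELF_REPORT`, the helper names, the chunk size, the mean-based aggregation, and the 0.5 threshold are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from statistics import mean

# Hypothetical pattern; the abstract does not list the expressions actually used.
# It looks for first-person statements such as "I was diagnosed with depression".
SELF_REPORT = re.compile(
    r"\bI(?:['’]m| am| was| have been|['’]ve been)\s+"
    r"(?:\w+\s+){0,3}?diagnosed\s+with\s+(?:\w+\s+){0,2}?depression\b",
    re.IGNORECASE,
)


def is_self_report(tweet: str) -> bool:
    """Return True if the tweet matches the illustrative self-report pattern."""
    return SELF_REPORT.search(tweet) is not None


def chunk_tweets(tweets: list[str], size: int) -> list[list[str]]:
    """Split a user's tweets into consecutive chunks of at most `size` tweets."""
    return [tweets[i:i + size] for i in range(0, len(tweets), size)]


def user_score(chunk_scores: list[float]) -> float:
    """Aggregate chunk-level scores into one user-level score.

    Averaging is an assumption; other aggregation rules are possible.
    """
    return mean(chunk_scores)


def user_label(chunk_scores: list[float], threshold: float = 0.5) -> bool:
    """Label a user as depressed when the aggregated score reaches the threshold."""
    return user_score(chunk_scores) >= threshold
```

In this sketch, each chunk would be scored by a classifier (not shown), and `user_label` turns the per-chunk scores into a single per-user decision.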
Results:
Our fusion model demonstrates an accuracy of 78.9% on a test set containing 446 people (N=446), half of whom are identified as suffering from depression. Conscientiousness, neuroticism, the appearance of first-person pronouns, talking about biological processes such as eating and sleeping, talking about power, and exhibiting sadness are shown to be important features in depression classification. Further, when used for monitoring the depression trend, our model shows that depressive users, in general, respond to the pandemic later than the control group, based on their tweets. It is also shown that three US states, New York (NY), California (CA), and Florida (FL), share a depression trend similar to that of the whole US population. Compared with people in NY and CA, people in FL demonstrate a significantly lower level of depression.
Conclusions:
This study proposes an efficient method that can be used to analyze the depression level of different groups of people on Twitter. We hope this study can raise awareness among researchers and the general public of COVID-19’s impact on people’s mental health. The non-invasive monitoring system can also be rapidly adapted to other big events besides COVID-19 and might be useful during future outbreaks.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.