Accepted for/Published in: JMIR Mental Health
Date Submitted: May 3, 2020
Date Accepted: Mar 31, 2021
Date Submitted to PubMed: Aug 12, 2021
Depression Detection of Twitter Posters using Deep Learning with Anaphora Resolution: Algorithm Development and Validation
ABSTRACT
Background:
Mental health problems are widely recognized as a major public health challenge worldwide. This highlights the need for effective tools for detecting mental health disorders in the population. Social media is a promising data source because users publish rich personal information that can be mined for valuable psychological cues. However, social media data poses its own challenges, such as the need to disambiguate between statements about oneself and statements about third parties. Traditionally, social media NLP techniques have treated text classifiers and user classification models separately, which makes it difficult for researchers to combine text sentiment and user sentiment analysis.
Objective:
The objective of this study is to develop a predictive model capable of detecting users with depression from Twitter posts and instantly highlighting their generated textual content. The model can address the problem of anaphoric resolution and highlight anaphoric interpretations.
Methods:
The dataset was retrieved from Twitter using a regular expression or a stream of real-time tweets and comprises 3,682 users: 1,983 self-declared as depressed and 1,699 with no declared depression. Two multiple instance learning (MIL) models were developed, with and without an anaphoric resolution encoder, to predict users with depression and highlight posts related to the mental health of the author. Several previously published models were replicated on our data set, and their results were compared with those of our models.
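The multiple instance learning idea in the Methods can be illustrated with a minimal sketch: a user is treated as a "bag" of posts, a user-level decision is derived from post-level scores, and the posts that drive the decision are highlighted. This is not the authors' architecture. The function name `predict_user`, the max-pooling aggregation, and the 0.5 threshold are all assumptions made for illustration, and the post scores are presumed to come from some post-level classifier that is not shown here.

```python
from typing import List, Tuple


def predict_user(post_scores: List[float], threshold: float = 0.5) -> Tuple[bool, List[int]]:
    """Hypothetical MIL-style aggregation (an assumption, not the paper's model).

    post_scores: per-post probabilities that a post reflects the author's
                 own depression, produced by some post-level classifier.
    Returns (user_flagged, indices_of_highlighted_posts).
    """
    if not post_scores:
        # An empty bag carries no evidence, so the user is not flagged.
        return False, []

    # Max pooling: the bag (user) is positive if any instance (post) is positive.
    user_score = max(post_scores)

    # Highlight every post whose score meets the threshold.
    highlighted = [i for i, score in enumerate(post_scores) if score >= threshold]

    return user_score >= threshold, highlighted


if __name__ == "__main__":
    flagged, posts = predict_user([0.1, 0.8, 0.3])
    print(flagged, posts)
```

Max pooling is only one possible aggregation choice; mean pooling or a learned attention weighting over posts are common alternatives, and the abstract does not state which one the proposed models use.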
Results:
The maximum accuracy, F1-score, and area under the curve (AUC) of our proposed anaphoric resolution model are 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, ranging from classical machine learning to deep learning techniques.
Conclusions:
Our proposed model with anaphoric resolution shows promising results relative to other predictive models and provides valuable insights into textual content relevant to the poster rather than a third party.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.