Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jan 24, 2020
Date Accepted: Jun 1, 2020
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Depression risk prediction for Chinese microblogs via deep learning methods
ABSTRACT
Background:
Depression has become a serious personal and public mental health problem. However, patients with depression are not easy to identify because they often cannot easily disclose or discuss their mental health conditions with others. Most existing work relies on self-report, which is time-consuming and usually misses a certain number of cases. Therefore, automatically discovering patients with depression from other sources, such as social media, has attracted increasing attention. Social media, as one of the most important daily communication systems, connects large numbers of people, including patients with depression, and provides a channel for discovering them. In this paper, we investigate deep learning methods for depression risk prediction on Chinese microblogs, which have the potential to discover patients with depression and trace their mental health conditions.
Objective:
The aim of this study is to explore the performance of some state-of-the-art deep learning methods on depression risk prediction for Chinese microblogs.
Methods:
Deep learning methods with pretrained language representation models, including BERT, RoBERTa, and XLNet, are investigated and evaluated on an annotated depression benchmark dataset collected from Weibo for depression risk prediction. We also compare the deep learning methods in two settings: (1) using publicly released pretrained language representation models directly, and (2) further pretraining the language representation models from (1) on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 score are used as performance evaluation measures.
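As an illustration of the evaluation measures named above, the following is a minimal sketch of per-class and macro-averaged precision, recall, and F1 score. It assumes the common convention in which the macro score is the unweighted mean of the per-class scores; the abstract does not state the exact averaging used. The function name `macro_prf` and the integer labels are hypothetical and are not taken from the benchmark dataset.

```python
from collections import Counter


def macro_prf(gold, pred):
    """Return per-class (precision, recall, F1) and their macro averages.

    Assumption: macro averaging is the unweighted mean over all classes
    that appear in either the gold or the predicted labels.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[g] += 1   # class g was missed

    per_class = {}
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = (precision, recall, f1)

    n = len(labels)
    macro = tuple(sum(scores[i] for scores in per_class.values()) / n
                  for i in range(3))
    return per_class, macro


# Hypothetical labels, for illustration only.
gold = [0, 0, 1, 1, 2]
pred = [0, 1, 1, 1, 2]
per_class, (macro_p, macro_r, macro_f1) = macro_prf(gold, pred)
```

Because every class contributes equally to the macro average, a rare class with low recall pulls the macro score down more than it would an accuracy or micro-averaged score.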
Results:
The deep learning methods achieve a best macro F1 score of 0.547, a new benchmark result on the depression benchmark dataset. Specifically, the deep learning methods achieve the highest macro precision of 0.536, recall of 0.376, and F1 score of 0.420 on microblogs of depression risk.
Conclusions:
In this study, we use deep learning methods with pretrained language representation models to predict depression risk for Chinese microblogs automatically. The experimental results show that the deep learning methods perform better than previous methods and have greater potential to discover patients with depression and trace their mental health conditions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.