Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Apr 1, 2024
Date Accepted: Dec 11, 2024

The final, peer-reviewed published version of this preprint can be found here:

Text-Based Depression Prediction on Social Media Using Machine Learning: Systematic Review and Meta-Analysis

Phiri D, Makowa F, Amelia VL, Phiri YVA, Dlamini LP, Chung MH

Text-Based Depression Prediction on Social Media Using Machine Learning: Systematic Review and Meta-Analysis

J Med Internet Res 2025;27:e59002

DOI: 10.2196/59002

PMID: 40215481

PMCID: 12032503

Text-Based Depression Prediction on Social Media Using Machine Learning: A Systematic Review and Meta-Analysis

  • Doreen Phiri
  • Frank Makowa
  • Vivi Leona Amelia
  • Yohane Vincent Abero Phiri
  • Lindelwa Portia Dlamini
  • Min-Huey Chung

ABSTRACT

Background:

Studies on the link between social media and mental health have primarily focused on depression. However, few reviews have been conducted on this topic, and most of them did not focus on depression, searched only a few databases, analyzed only language features, and did not consider machine learning.

Objective:

This systematic review and meta-analysis evaluated the effect of social media texts overall and of demographic, linguistic, activity, and temporal features in predicting depression through machine learning.

Methods:

We searched 11 databases for articles published from inception to August 2023. Data analysis was performed using Comprehensive Meta-Analysis version 3. We used a random-effects model to pool the effect sizes with 95% CIs. Study heterogeneity was evaluated using forest plots and the P values from the Cochran Q test. Moderator analysis was performed to identify the sources of heterogeneity. We used the Begg–Mazumdar rank correlation and Egger’s tests to assess publication bias, and we performed a leave-one-out sensitivity analysis to determine the stability of the results.
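The pooling, heterogeneity, and leave-one-out steps described above can be sketched in a few lines. This is a minimal illustration only, not the authors' procedure: the analysis itself was run in Comprehensive Meta-Analysis, and the abstract does not state the estimator used. The sketch assumes correlations are pooled on the Fisher z scale with a DerSimonian-Laird estimate of between-study variance. The function names and the input values `rs` and `ns` are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(rs, ns):
    """Pool correlations with a random-effects model (assumed: Fisher z, DerSimonian-Laird)."""
    if len(rs) < 2 or len(rs) != len(ns):
        raise ValueError("need at least two studies with matching sample sizes")
    zs = [math.atanh(r) for r in rs]      # Fisher z transform of each correlation
    vs = [1.0 / (n - 3) for n in ns]      # sampling variance of z for sample size n
    w = [1.0 / v for v in vs]             # inverse-variance (fixed-effect) weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    # Cochran Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)         # between-study variance, truncated at zero
    w_re = [1.0 / (v + tau2) for v in vs]  # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    lo, hi = z_re - 1.96 * se, z_re + 1.96 * se
    # Back-transform the pooled estimate and its 95% CI to the r scale
    return {"r": math.tanh(z_re), "ci": (math.tanh(lo), math.tanh(hi)),
            "q": q, "df": df, "tau2": tau2}


def leave_one_out(rs, ns):
    """Re-pool after dropping each study in turn; returns one pooled r per omission."""
    return [pool_random_effects(rs[:i] + rs[i + 1:], ns[:i] + ns[i + 1:])["r"]
            for i in range(len(rs))]


# Hypothetical inputs for illustration only (not taken from the review)
rs = [0.45, 0.60, 0.72, 0.55]
ns = [120, 300, 80, 500]
result = pool_random_effects(rs, ns)
print(result)
print(leave_one_out(rs, ns))
```

If the leave-one-out estimates stay close to the full pooled estimate, no single study is driving the result, which is the sense in which a sensitivity analysis indicates stability.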

Results:

We included 36 studies on machine learning–based depression prediction using social media textual data. We observed a significant overall correlation between social media texts and depression, with a large effect size (r = 0.630, 95% CI: 0.565–0.686). The correlation was likewise significant, with a large effect size, for demographic (largest effect size; r = 0.642, 95% CI: 0.489–0.757), activity (r = 0.552, 95% CI: 0.418–0.663), linguistic (r = 0.545, 95% CI: 0.441–0.649), and temporal features (r = 0.531, 95% CI: 0.320–0.693). The social media platform type (public or private, P < .001), machine learning approach (shallow or deep, P = .048), and use of outcome measures (yes or no, P < .001) were significant moderators. Sensitivity analysis revealed no change in the results, indicating result stability. The Begg–Mazumdar rank correlation (Kendall’s Tau b = 0.22063, P = .058) and Egger’s test (t = 1.28696, P = .207) indicated no significant publication bias.
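The reported interval for the overall effect is not symmetric around r = 0.630 on the correlation scale. A minimal sketch, assuming the interval was computed on the Fisher z scale and back-transformed (a common approach for pooled correlations that the abstract does not confirm), shows how the reported numbers can be read on that scale. Only the three values from the overall estimate are used; the variable names are illustrative.

```python
import math

# Overall estimate and 95% CI as reported in the Results
r, ci_low, ci_high = 0.630, 0.565, 0.686

# Fisher z transform of the point estimate and both limits
z = math.atanh(r)
z_low, z_high = math.atanh(ci_low), math.atanh(ci_high)

# On the z scale a 95% CI is z +/- 1.96 * SE, so the half-width implies the SE
se_z = (z_high - z_low) / (2 * 1.96)
midpoint = (z_low + z_high) / 2

print(f"z = {z:.4f}, CI midpoint on z scale = {midpoint:.4f}")
print(f"implied SE on z scale = {se_z:.4f}")
print(f"distance below / above r: {r - ci_low:.3f} / {ci_high - r:.3f}")
```

The midpoint of the transformed limits falls very close to the transformed point estimate, which is consistent with the assumption, and the implied standard error on the z scale is roughly 0.05. Rounding of the published values accounts for the small remaining gap.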

Conclusions:

Social media textual content can be a useful tool for predicting depression. To maximize the accuracy of depression prediction models, it is important to consider demographic, linguistic, activity, and temporal features. Additionally, the effects of social media platform type, machine learning approach, and use of outcome measures in depression prediction models need attention.

Clinical Trial:

PROSPERO, registration number CRD42023427707


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.