Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: May 28, 2025
Date Accepted: Oct 8, 2025
Date Submitted to PubMed: Oct 10, 2025

The final, peer-reviewed published version of this preprint can be found here:

The Efficacy of Rule-Based Versus Large Language Model-Based Chatbots in Alleviating Symptoms of Depression and Anxiety: Systematic Review and Meta-Analysis

Du Q, Ren Y, Meng Zl, He H, Meng S

J Med Internet Res 2025;27:e78186

DOI: 10.2196/78186

PMID: 41073272

PMCID: 12677872

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Large Language Model-Based Chatbots for Depression and Anxiety: Systematic Review and Meta-Analysis

  • Qiuxue Du
  • Yongliang Ren
  • Ze-long Meng
  • Han He
  • Shasha Meng

ABSTRACT

Background:

The global mental health crisis is worsening: more than 280 million people live with depression and more than 301 million with anxiety disorders. Shortages of mental health professionals, high treatment costs, and limited access to services create an urgent need for scalable, low-cost interventions. Chatbots based on large language models (LLMs), which draw on advanced natural language processing and deep learning techniques, have emerged as a new tool for providing psychological support. However, their therapeutic effects have not been fully studied, particularly whether they differ between depression and anxiety. Existing research has clear limitations: studies use different psychological assessment tools (e.g., PHQ-9, GAD-7, STAI) and differ substantially in intervention design (dialogue system architecture, intervention duration), which makes efficacy results difficult to compare directly.

Objective:

This study systematically evaluates the effectiveness of LLM chatbots in alleviating symptoms of depression and anxiety, and analyzes the moderating effects of intervention duration, control group type (e.g., blank control vs. traditional therapy), and demographic characteristics (e.g., age).

Methods:

A systematic search of PubMed, Cochrane, Scopus, and CNKI identified 10 studies (17 effect measures) published between 2020 and 2025, comprising randomized controlled trials (RCTs) and pre-post test designs. Participants were adults aged 16 years and above with symptoms of depression or anxiety. Effects were measured using Hedges' g and pooled with a random-effects model because of moderate heterogeneity (depression I² = 35.6%, anxiety I² = 54.96%). Subgroup analyses explored the effects of control group type, intervention duration, and age.
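The pooling approach named above can be sketched as follows. This is a minimal illustration only, assuming two independent groups per study and the DerSimonian-Laird estimator for between-study variance; the abstract does not state which estimator or software the authors used, and the function names `hedges_g` and `random_effects` are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for two independent groups.

    Assumption: the standard pooled-SD formula with the usual
    small-sample correction factor J = 1 - 3 / (4*df - 1).
    """
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Assumption: DerSimonian-Laird is used here for illustration; it is
    not confirmed as the authors' method.
    """
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

Each study contributes one `(g, variance)` pair from `hedges_g`, and the list of pairs is passed to `random_effects`. When the between-study variance estimate is zero, the result reduces to the fixed-effect estimate.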

Results:

The pooled analysis showed a marginally significant improvement in depression symptoms (g = 0.157, 95% CI [-0.013, 0.328], p = 0.071) and a significant improvement in anxiety symptoms (g = 0.277, 95% CI [0.069, 0.486], p = 0.009). In subgroup analyses of depressive symptoms, the effect size was larger in comparisons against a blank control group (g = 0.309) than against a traditional therapy group (g = 0.193). For anxiety symptoms, long-term interventions (> 4 weeks) had a significant effect (g = 0.572), far larger than that of short-term interventions (g = 0.063); older adults (> 50 years) benefited more (g = 0.372).
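As a rough consistency check on the pooled estimates reported above, the p-value implied by an effect size and its confidence interval can be recovered. The sketch below assumes the intervals are symmetric, normal-based (Wald) 95% intervals, which the abstract does not confirm; the function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Return (se, z, two-sided p) implied by a normal-based 95% CI.

    Assumption: the interval is estimate +/- z_crit * se.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values copied from the Results paragraph.
for label, g, lo, hi in [
    ("depression", 0.157, -0.013, 0.328),
    ("anxiety", 0.277, 0.069, 0.486),
]:
    se, z, p = p_from_ci(g, lo, hi)
    print(f"{label}: se={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under that assumption, the recovered p-values come out close to the reported 0.071 and 0.009, which suggests the reported intervals and p-values are internally consistent.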

Conclusions:

LLM chatbots show a clinically meaningful benefit for anxiety, especially among older adults and with long-term interventions, but their effect on depression is limited. Future research should optimize depression intervention designs, integrate multimodal therapies, and focus on long-term effectiveness.




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.