
Accepted for/Published in: JMIR Mental Health

Date Submitted: Jun 29, 2025
Open Peer Review Period: Jun 29, 2025 - Aug 24, 2025
Date Accepted: Sep 29, 2025

The final, peer-reviewed published version of this preprint can be found here:

Sobowale K, Humphrey DK, Zhao SY

Evaluating Generative AI Psychotherapy Chatbots Used by Youth: Cross-Sectional Study

JMIR Ment Health 2025;12:e79838

DOI: 10.2196/79838

PMID: 41370787

PMCID: 12694945

Evaluating Generative Artificial Intelligence Psychotherapy Chatbots Used by Youth: A Cross-Sectional Study

  • Kunmi Sobowale; 
  • Daniel Kevin Humphrey; 
  • Sophia Yingruo Zhao

ABSTRACT

Background:

Many youth rely on direct-to-consumer generative artificial intelligence (GenAI) chatbots for mental health support, yet the quality of the psychotherapeutic capabilities of these chatbots is understudied.

Objective:

We sought to comprehensively evaluate and compare the quality of widely used GenAI chatbots with psychotherapeutic capabilities.

Methods:

In this cross-sectional study, trained raters used an evaluation framework to rate the quality of five chatbots from GenAI platforms widely used by youth. Raters roleplayed using personas of youth with mental health challenges to prompt the chatbots and facilitate conversations. Chatbot responses were generated from August to October 2024. The primary outcomes were rated scores in nine sections. The proportion of high-quality ratings (binary rating of 1) in each section was compared between chatbots using Bonferroni-corrected χ2 tests.

Results:

While GenAI chatbots were found to be accessible (104 high-quality ratings [87%]) and to avoid harmful statements and misinformation (71 of 80 [89%]), they performed poorly in their therapeutic approach (14 of 45 [35%]) and in their ability to monitor and assess risk (31 of 80 [39%]). Information on chatbot model training and knowledge was unavailable, resulting in low scores. Bonferroni-corrected χ2 tests showed statistically significant differences in chatbot quality in the background, therapeutic approach, and monitoring and risk evaluation sections. Qualitatively, raters perceived most chatbots as having strong conversational abilities but found them plagued by various issues, including fabricated content and poor handling of crisis situations.

Conclusions:

Overall, direct-to-consumer GenAI chatbots showed mixed results in terms of quality, suggesting potential for harm and demonstrating a greater need for transparency and oversight. These findings may enable youth and other stakeholders to make informed decisions about using chatbots for mental health support.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.