Currently accepted at: JMIR Formative Research
Date Submitted: Dec 31, 2025
Date Accepted: Feb 27, 2026
This paper has been accepted and is currently in production.
It will appear shortly at DOI 10.2196/90644.
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Pilot Evaluation of a Mental Wellness Chatbot for Depression and Anxiety: User Experience and Early Clinical Outcomes from a Mixed-Methods Study
ABSTRACT
Background:
AI-powered conversational agents (i.e., chatbots) are increasingly popular outlets for users seeking psychological support, yet little is known about how users experience early-stage prototypes or which therapeutic processes contribute to clinical improvement. Transparent evaluation of emerging chatbot prototypes is needed to clarify if, how, and why AI companions work and to guide their continued development.
Objective:
This mixed-methods pilot study evaluated user experience, acceptability, and preliminary clinical signals for an early-stage mental wellness chatbot. We also examined whether baseline symptom severity moderated clinical improvement.
Methods:
Three sequential cohorts (N=125) completed a two-week, incentivized chatbot exposure (approximately 60 minutes per week). Participants provided first-impression ratings, qualitative feedback, and pre–post assessments of depressive symptoms (PHQ-8), anxiety symptoms (GAD-7), psychological distress, well-being, and loneliness. Statistical models estimated symptom change and tested interactions with baseline symptom severity. Mixed-methods analysis integrated quantitative outcomes with thematic analysis of open-ended responses.
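The abstract does not specify the model used to test moderation by baseline severity. As a minimal sketch of the general idea, assuming a simple least-squares regression of the pre–post change score on the baseline score, the following Python code estimates how predicted change varies with baseline severity. The function names and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import fmean


def fit_change_on_baseline(baseline, followup):
    """Fit change = intercept + slope * baseline by ordinary least squares.

    Assumption: a single-predictor model on change scores; the study's
    actual models may include covariates or a different specification.
    """
    if len(baseline) != len(followup) or len(baseline) < 3:
        raise ValueError("need paired scores from at least 3 participants")
    change = [f - b for b, f in zip(baseline, followup)]
    mean_b = fmean(baseline)
    mean_c = fmean(change)
    sxx = sum((b - mean_b) ** 2 for b in baseline)
    if sxx == 0:
        raise ValueError("baseline scores must vary")
    sxy = sum((b - mean_b) * (c - mean_c) for b, c in zip(baseline, change))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_b
    return intercept, slope


def predicted_change(intercept, slope, baseline_score):
    """Model-predicted pre-post change at a given baseline score."""
    return intercept + slope * baseline_score


# Toy values for illustration only (not study data).
toy_baseline = [5, 10, 15, 20]
toy_followup = [5, 8, 11, 14]
intercept, slope = fit_change_on_baseline(toy_baseline, toy_followup)
print(predicted_change(intercept, slope, 15))
```

A negative slope in this sketch means that higher baseline scores are associated with larger predicted reductions, which is the pattern a moderation test of this kind looks for.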
Results:
Participants described the chatbot as accessible, easy to use, and emotionally validating, while citing limitations in personalization and conversational depth. Qualitative themes highlighted early therapeutic processes such as emotional validation, goal setting, and perceived attunement. Regression models showed significant pre–post reductions in depressive (Hedges’ g = –0.32) and anxiety (g = –0.32) symptoms, alongside modest improvements in distress and well-being. Baseline severity significantly moderated improvement: marginal-effects analyses indicated larger predicted reductions at higher PHQ-8 and GAD-7 baseline scores (e.g., PHQ-8 = 15: g = –0.84; GAD-7 = 15: g = –0.62).
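For readers unfamiliar with the effect size reported above, the following Python sketch shows one common way to compute a pre–post Hedges' g. It is an illustration under stated assumptions: the change is standardized by the square root of the average of the pre and post variances, and the small-sample correction uses n − 1 degrees of freedom. The abstract does not state which variant the authors used, and the function name and toy scores are hypothetical.

```python
import math
import statistics


def hedges_g_prepost(pre, post):
    """Standardized pre-post mean change with a small-sample correction.

    Assumptions (not confirmed by the source): the standardizer is the
    root of the averaged pre and post sample variances, and the
    correction factor is J = 1 - 3 / (4 * df - 1) with df = n - 1.
    """
    if len(pre) != len(post) or len(pre) < 3:
        raise ValueError("need paired scores from at least 3 participants")
    n = len(pre)
    mean_change = statistics.fmean(post) - statistics.fmean(pre)
    sd_avg = math.sqrt(
        (statistics.variance(pre) + statistics.variance(post)) / 2
    )
    if sd_avg == 0:
        raise ValueError("scores must vary")
    d = mean_change / sd_avg          # uncorrected standardized change
    j = 1 - 3 / (4 * (n - 1) - 1)     # small-sample correction factor
    return j * d


# Toy values for illustration only (not study data).
toy_pre = [10, 12, 14, 16, 18]
toy_post = [8, 10, 12, 14, 16]
print(round(hedges_g_prepost(toy_pre, toy_post), 3))
```

With symptom scales such as the PHQ-8 and GAD-7, where lower scores indicate fewer symptoms, a negative g indicates improvement from baseline to follow-up.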
Conclusions:
This pilot provides a transparent view of early chatbot development and demonstrates promising user experiences, preliminary symptom improvements, and clear mechanistic targets. Findings support continued refinement and motivate larger, longer-term trials evaluating sustained engagement, clinical durability, and performance among individuals with greater baseline severity.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.