Accepted for/Published in: JMIR Mental Health
Date Submitted: Dec 5, 2024
Date Accepted: Mar 30, 2025
Can Chatbots Offer What Therapists Do? A Mixed Methods Comparison Between Responses From Therapists and LLM-Based Chatbots
ABSTRACT
Background:
Consumers are increasingly turning to chatbots based on large language models (LLMs) for mental health advice or intervention because of their ease of access and the limited availability of mental health professionals. However, the suitability and safety of these chatbots for mental health applications remain underexplored, particularly in comparison to professional therapeutic practice.
Objective:
This study aimed to evaluate how general-purpose chatbots respond to mental health scenarios and compare their responses to those provided by licensed therapists. Specifically, we sought to identify chatbots’ strengths, limitations, and the ethical and practical considerations necessary for their use in mental health care.
Methods:
We conducted a mixed methods study comparing responses from chatbots and licensed therapists to scripted mental health scenarios. We created 2 fictional scenarios and prompted 3 chatbots with them to create 6 interaction logs. We then recruited 17 therapists and conducted study sessions consisting of 3 activities. First, therapists responded to the 2 scenarios using a Qualtrics form. Second, therapists reviewed the 6 interaction logs using a think-aloud procedure, highlighting their thoughts about the chatbots' responses. Lastly, we conducted semi-structured interviews to explore therapists' subjective opinions on the use of chatbots for supporting mental health. The study sessions were analyzed using thematic analysis. The chatbot and therapist responses were coded using the Multitheoretical List of Therapeutic Interventions (MULTI) and then compared with each other.
Results:
We identified 7 themes describing the strengths and limitations of the chatbots as compared to therapists: elements of good therapy in the chatbots' responses, the conversational style of chatbots, insufficient inquiry and feedback seeking by chatbots, chatbot interventions, client engagement, chatbots' responses to crisis situations, and considerations for chatbot-based therapy. In the MULTI coding, therapists evoked elaboration more often than the chatbots (t = 4.50, p = 0.001) and used more self-disclosure, although this difference was not statistically significant (t = 1.05, p = 0.31). The chatbots used reassuring language more often than the therapists (t = 2.29, p = 0.03) and showed a nonsignificant trend toward more affirming language (t = 1.71, p = 0.10). The chatbots also employed psychoeducation (t = 2.69, p = 0.01) and suggestions (t = 4.23, p = 0.001) more often than the therapists did.
Conclusions:
Our study demonstrates that general-purpose chatbots are not suited to safely engage in mental health conversations, particularly in crisis situations. While the chatbots displayed elements of good therapy, such as validation and reassurance, their overuse of directive advice without sufficient inquiry and their reliance on generic interventions make them unsuitable as therapeutic agents. Careful research and evaluation will be necessary to determine the impact of chatbot interactions and to identify the most appropriate mental health use cases.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper for purposes other than review.