Currently submitted to: JMIR Formative Research
Date Submitted: Mar 5, 2026
Open Peer Review Period: Mar 6, 2026 - May 1, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Development and Evaluation of a Learning-by-Teaching Framework for CBT Skill-Building using Commonly Accessible AI: A Pilot Study
ABSTRACT
Background:
Current AI interventions in mental health positions LLMs to act as therapists, raising concerns regarding simulated emotional bond and clinical safety. These systems risk patients becoming dependent on the tool, instead of fostering their own therapeutic skills for long-term recovery.
Objective:
This paper explores design considerations for adapting commonly-used LLMs (e.g., Gemini, ChatGPT, Llama) for clinical use, using them as a skill-building tool rather than a replacement for therapists.
Methods:
Guided by the educational theories, we developed a dual-persona chatbot. The first persona is a distressed character with cognitive distortion; the second persona is a facilitator that provides the user with scaffolding and instructions to navigate the interaction safely and successfully. Users are tasked to “help” the first persona, with the aid of the second persona, by identifying and restructuring their cognitive distortions. Through a process involving initial testing, establishing personas, and ensuring fidelity/safety, we developed three versions of the system. Four raters with varying clinical expertise assessed simulated interactions across four domains: Character Fidelity, Effective Facilitation, Boundary Management, and Overall Utility.
Results:
Inter-rater reliability among the raters was high (ICC = 0.76). The final version of the system was rated as effective in terms of character fidelity, learning facilitation, and clinical boundaries. The largest improvement across versions was in the construction of an effective and safe learning environment (F2,61 = 42.11, P <.001 for instruction clarity, F2,32 = 12.44, P <.001 for handling clinical risk), while character fidelity was rated highly across versions with little variation. The raters agreed that the tool is helpful for users to consolidate the skill of cognitive restructuring.
Conclusions:
By shifting the AI’s role from a source of emotional support to a subject for practice, this system encourages the user to engage in the practice to “be their own therapist”. Our findings provide a generalizable roadmap for integrating commercial AI into clinical workflows as a secure, skill-based supplement to human-led therapy.
Citation
Request queued. Please wait while the file is being generated. It may take some time.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.