Accepted for/Published in: JMIR Mental Health
Date Submitted: Oct 18, 2023
Date Accepted: Apr 15, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Generation of Backward-Looking Complex Reflections for a Motivational Interviewing-Based Smoking Cessation Chatbot using GPT-4
ABSTRACT
Background:
Motivational Interviewing (MI) is a therapeutic technique that has been successful in helping smokers reduce smoking, but has limited accessibility due to the high cost and low availability of clinicians. To address this, the MIBot project has sought to develop a chatbot that emulates an MI session with a client with the specific goal of moving an ambivalent smoker towards the direction of quitting. One key element of an MI conversation is reflective listening, where a therapist expresses their understanding of what the client has said by uttering a reflection that encourages the client to continue their thought process. Complex reflections link the client’s responses to relevant ideas and facts to enhance this contemplation. Backward-looking complex reflections (BLCRs) link the client’s most recent response to a relevant selection of the client’s previous statements. Our current chatbot can generate complex reflections - but not BLCRs - using large language models (LLMs) such as GPT-2, which allows the generation of unique, human-like messages customized to client responses. Recent advances in these models, such as the introduction of GPT-4, provide a novel way to generate complex text by feeding the models instructions and conversational history directly, making this a promising approach to generate BLCRs.
Objective:
To develop a method to generate BLCRs for an MI-based smoking cessation chatbot, and to measure the method's effectiveness.
Methods:
Large Language Models such as GPT-4 can be stimulated to produce specific types of responses to their inputs by “asking” them with an English-based description of the desired output. These descriptions are called prompts, and the challenge of writing a description that allows LLMs to generate the optimal output is termed prompt engineering. We evolved an instruction to prompt GPT-4 to generate a BLCR given the prior transcript of the conversation up to the point where the reflection was needed. The approach was tested on 50 previously collected MIBot transcripts of conversations with smokers, and was used to generate a total of 150 reflections. The quality of the reflections was rated on a 4-point scale by three independent raters to determine if they met specific criteria for acceptability.
Results:
Of the 150 generated reflections, 132 (88%) of the reflections met the level of acceptability. The remaining 18 (12%) had one or more flaws that made them inappropriate BLCRs. The three raters had pairwise agreement on 80% to 88% of these scores.
Conclusions:
The method presented to generate BLCRs is good enough to be used as one source of reflections in an MI-style conversation, but would need an automatic checker to eliminate the unacceptable ones. This work illustrates the power of the new LLMs to generate therapeutic client-specific responses under the command of a language-based specification.
Citation
Request queued. Please wait while the file is being generated. It may take some time.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.