Accepted for/Published in: JMIR AI
Date Submitted: Dec 9, 2024
Open Peer Review Period: Dec 9, 2024 - Feb 3, 2025
Date Accepted: Jan 30, 2025
Date Submitted to PubMed: Apr 10, 2025
Prompt Engineering an Informational Chatbot for Educating about Mental Health: Utilizing a Multi-Agent Approach for Enhanced Compliance with Prompt Instructions
ABSTRACT
Background:
Patients with schizophrenia often present with cognitive impairments that may hinder their ability to learn about their condition. Education platforms powered by Large Language Models (LLMs) have the potential to improve accessibility of mental health information. However, the black-box nature of LLMs raises ethical and safety concerns regarding the unpredictability of LLM-based agents. In particular, prompt-engineered chatbots may drift from their intended role as the conversation progresses and become more prone to hallucinations.
Objective:
To develop and evaluate a Critical Analysis Filter (CAF) that ensures that LLM-powered conversational agents reliably comply with their instructions and scope while delivering validated mental health information.
Methods:
As a proof of concept, we prompt-engineered an educational schizophrenia chatbot powered by GPT-4 that can dynamically access information from a schizophrenia manual written primarily for patients. In the CAF, a team of prompt-engineered LLM agents is used to critically analyze and refine the chatbot's responses and provide it with real-time self-reflective feedback. To assess the ability of the CAF to re-establish the chatbot's adherence to its instructions, we generated three conversations (by conversing with the chatbot with the CAF disabled) in which the chatbot started to drift from its instructions towards various unintended roles. We used these checkpoint conversations to initialize automated conversations between the chatbot and adversarial chatbots designed to entice it towards unintended roles. Conversations were repeatedly sampled with the CAF enabled and disabled, respectively. Three human raters independently rated each chatbot response according to criteria developed to measure the chatbot's integrity; specifically, its transparency (such as admitting when a statement lacks explicit support from its scripted sources) and its tendency to faithfully convey the scripted information in the schizophrenia manual.
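To make the mechanism concrete, the sketch below shows one way such a critique-and-revise loop could be wired together. It is a minimal illustration under stated assumptions, not the authors' implementation: the helper `call_llm`, the critic prompt, the PASS/FAIL convention, and the revision limit are all hypothetical, and the paper's CAF uses a team of specialized agents rather than the single critic shown here.

```python
# Minimal sketch of a Critical Analysis Filter (CAF) loop.
# Assumptions (not from the paper): `call_llm`, the prompts,
# the PASS/FAIL convention, and the `max_revisions` limit.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def call_llm(system_prompt: str, user_content: str) -> str:
    """Single LLM call; the model name is a placeholder."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
    )
    return response.choices[0].message.content


CRITIC_PROMPT = (
    "You are a compliance critic. Given a chatbot reply and excerpts from a "
    "schizophrenia manual, answer PASS if the reply stays within its "
    "educational role and is supported by the excerpts; otherwise answer "
    "FAIL followed by specific feedback."
)


def critical_analysis_filter(draft: str, manual_excerpts: str,
                             chatbot_prompt: str, user_query: str,
                             max_revisions: int = 2) -> str:
    """Critique the draft reply and revise it until a critic passes it."""
    for _ in range(max_revisions):
        verdict = call_llm(
            CRITIC_PROMPT,
            f"Manual excerpts:\n{manual_excerpts}\n\nReply:\n{draft}",
        )
        if verdict.strip().upper().startswith("PASS"):
            return draft
        # Feed the critic's feedback back to the chatbot as
        # real-time self-reflective feedback, then regenerate.
        draft = call_llm(
            chatbot_prompt,
            f"User asked: {user_query}\n"
            f"Your previous reply was rejected for this reason:\n{verdict}\n"
            f"Revise your reply using only the manual excerpts:\n"
            f"{manual_excerpts}",
        )
    return draft  # return the last revision if the limit is reached
```

In the paper's design, multiple such agents analyze and refine different aspects of each response before the feedback is returned to the chatbot; the single-critic loop above is only meant to convey the overall control flow.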
Results:
A total of 36 responses (3 checkpoint conversations × 3 conversations per checkpoint × 4 adversarial queries per conversation) were scored under each condition (CAF enabled and CAF disabled), for 72 evaluated responses overall. Activating the CAF resulted in a compliance score that was considered acceptable (≥2) in 67.0% of responses, compared to only 8.7% when the CAF was deactivated.
Conclusions:
Although more extensive testing in realistic scenarios is needed, our results suggest that self-reflection mechanisms could enable LLMs to be used effectively and safely in educational mental health platforms. This approach harnesses the flexibility of LLMs while reliably constraining their scope to appropriate and accurate interactions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.