Accepted for/Published in: JMIR Medical Education
Date Submitted: Jul 26, 2023
Date Accepted: Nov 8, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
The Value and Risks of Generative AI in Mental Health & Substance Use Education and Prevention: Viewpoint
ABSTRACT
Background:
The use of Generative Artificial Intelligence, specifically Large Language Models (LLMs), is proliferating, and as such it is vital to consider both the value and risks of its use in health education. Their efficiency across a variety of writing styles makes LLMs attractive for tailoring educational materials. However, this technology can feature biases and misinformation, which can be particularly harmful in medical education settings, such as mental health and substance use education. This viewpoint investigates whether LLMs are sufficient for two common health education functions, namely answering users’ direct queries and aiding the development of quality consumer educational health materials in the field of mental health and substance use.
Objective:
Insight into the accessibility, biases, and quality of LLM-produced query responses and educational health materials will enable us to provide guidance for the general public and health educators wishing to utilise GPT-4, the most common LLM among the general public.
Methods:
We collected real-world queries and engineered a variety of prompts to use on GPT-4 Pro with the Bing BETA internet browsing plug-in. The outputs were evaluated with tools from the Sydney Health Literacy Lab to determine accessibility; adherence to Mindframe communication guidelines was assessed to identify biases; and tailoring to audiences, duty-of-care disclaimers, and evidence-based internet references were used to assess quality.
Results:
GPT-4’s outputs have good face validity but, upon detailed analysis, are substandard. Without engineered prompting, the reading level, adherence to communication guidelines, and use of evidence-based websites are poor. Therefore, all outputs still require caution, human editing, and oversight.
Conclusions:
GPT-4 is currently not reliable enough for direct consumer queries, but educators and researchers can utilise it with caution for creating educational materials. Materials created with LLMs should disclose the use of Generative AI and be evaluated for efficacy with the target audience.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.