Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Sep 25, 2024
Open Peer Review Period: Sep 26, 2024 - Nov 21, 2024
Date Accepted: Nov 30, 2024
Assessing the Reliability of ChatGPT Chatbots for Smoking Cessation: A Content Analysis
ABSTRACT
Background:
Large language model (LLM) AI chatbots using generative language can offer smoking cessation information and advice. However, little is known about the reliability of the information they provide to users.
Objective:
This study aims to examine whether 3 ChatGPT chatbots – the World Health Organization’s (WHO) Sarah, BeFreeGPT, and BasicGPT – provide reliable information on how to quit smoking.
Methods:
A list of quit smoking queries was generated from frequent quit smoking searches on Google related to “how to quit smoking” (N=12). Each query was given to each chatbot, and responses were analyzed for their adherence to an index developed from the United States Preventive Services Task Force (USPSTF) public health guidelines for quitting smoking and counseling principles. Responses were independently coded by 2 reviewers, and differences were resolved by a third coder.
Results:
Across chatbots and queries, chatbot responses were rated as adherent to 57.1% of the items on the adherence index. Sarah’s adherence (72.2%) was significantly higher than that of BeFreeGPT (50.0%) and BasicGPT (47.8%) (p<.01). The majority of chatbot responses had clear language (97.3%) and included a recommendation to seek out professional counseling (80.3%). About half of responses included the recommendation to consider using nicotine replacement therapy (NRT) (52.7%), the recommendation to seek out social support from friends and family (55.6%), and information on how to deal with cravings when quitting smoking (44.4%). Least common was information about considering the use of non-NRT prescription drugs (14.1%). Finally, some type of misinformation was present in 22.0% of responses. Specific queries that were most challenging for the chatbots included queries on “how to quit smoking cold turkey,” “… with vapes,” “…with gummies,” “…with a necklace,” and “…with hypnosis.” All chatbots showed resilience to adversarial attacks intended to derail the conversation.
Conclusions:
LLM chatbots varied in their adherence to quit smoking guidelines and counseling principles. While the chatbots reliably provided some types of information, they omitted other types and occasionally provided misinformation, especially for queries about less evidence-based methods of quitting. LLM chatbot instructions can be revised to compensate for these weaknesses. Clinical Trial: n/a
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.