Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jun 25, 2025
Date Accepted: Oct 2, 2025
Evaluation of an Artificial Intelligence Conversational Chatbot to Enhance HIV Pre-Exposure Prophylaxis Uptake: Development and Usability Internal Testing
ABSTRACT
Background:
The HIV epidemic in the United States disproportionately impacts gay, bisexual, and other men who have sex with men (MSM). Despite the effectiveness of HIV pre-exposure prophylaxis (PrEP) in preventing HIV acquisition, uptake among MSM remains suboptimal. Motivational interviewing (MI) has demonstrated efficacy at increasing PrEP uptake among MSM but is resource-intensive, limiting scalability. The use of artificial intelligence (AI), particularly large language models with conversational agents (i.e., “chatbots”) such as ChatGPT, may offer a scalable approach to delivering MI-based counseling for PrEP and HIV prevention.
Objective:
This study aimed to describe the development of an AI-based chatbot and evaluate its ability to provide MI-aligned education about PrEP and HIV prevention.
Methods:
The Chatbot for HIV Prevention and Action (CHIA) was built on a GPT-4o base model embedded with a validated knowledge database on HIV and PrEP in English and Spanish. CHIA was fine-tuned through training on a large MI dataset and prompt engineering. Use of the AutoGen multi-agent framework enabled CHIA to integrate two agents, the PrEP Counselor Agent and the Assistant Agent, which specialized in providing MI-based counseling and handling function calls (e.g., assessment of HIV risk), respectively. During internal testing from March 10 to April 28, 2025, we systematically evaluated CHIA’s performance in English and Spanish using a set of five-point Likert scales to measure accuracy, conciseness, up-to-dateness, trustworthiness, and alignment with aspects of the MI spirit (e.g., collaboration, autonomy support) and MI-consistent behaviors (e.g., affirmation, open-ended questions). Descriptive statistics and independent-samples t tests were used to analyze the data.
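As a rough illustration of the analysis described above, the following Python sketch computes a mean and standard deviation for two groups of Likert ratings and an independent-samples t statistic. It assumes the pooled-variance (Student) form of the test, which the abstract does not specify; the function names and the ratings are hypothetical placeholders and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def describe(scores):
    """Return (mean, sample standard deviation) for a list of ratings."""
    return mean(scores), stdev(scores)


def student_t(group_a, group_b):
    """Pooled-variance independent-samples t statistic and degrees of freedom.

    Assumption: equal variances in both groups (Student's t test).
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical five-point Likert ratings for one metric (NOT study data).
english_ratings = [5, 4, 5, 4, 5, 3]
spanish_ratings = [4, 3, 4, 3, 4, 2]

t_stat, dof = student_t(english_ratings, spanish_ratings)
```

The resulting t statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a P value; a statistics package would typically handle that step.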
Results:
A total of 305 responses, including 140 English responses and 165 Spanish responses, were collected during the internal testing period. Overall, CHIA demonstrated strong performance across both languages, receiving the highest combined scores on the general response quality metrics, including up-to-dateness (mean 4.6, SD 0.8), trustworthiness (mean 4.5, SD 0.9), accuracy (mean 4.4, SD 0.9), and conciseness (mean 4.2, SD 1.1). CHIA generally received higher combined scores for metrics that assessed alignment with the MI spirit (i.e., empathy, evocation, autonomy support, and collaboration) and lower combined scores for MI-consistent behaviors (i.e., affirmation, open-ended questions, and reflections). Spanish responses had significantly lower mean scores than English responses across nearly all MI-based metrics.
Conclusions:
These findings highlight the potential of AI-based chatbots, including CHIA, as scalable tools for delivering MI-aligned counseling in English and Spanish to promote HIV prevention and PrEP uptake.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.