JMIR Preprints #70176: Development of AI Chatbots for Cancer Information: Reducing Hallucinations and Trade-Offs in Responses with Reliable Data

Development of AI Chatbots for Cancer Information: Reducing Hallucinations and Trade-Offs in Responses with Reliable Data

Sota Nishisako;
Takahiro Higashi;
Fumihiko Wakao

ABSTRACT

Background:

Generative artificial intelligence (AI) is increasingly used to find information. Accurate information is essential for supporting cancer patients and their families; however, generative AI sometimes returns incorrect information, a phenomenon known as hallucination.

Objective:

We aimed to examine cancer information returned by generative AIs with retrieval-augmented generation (RAG) using cancer-specific information sources and general internet search.

Methods:

We compiled 62 cancer-related questions in Japanese. We developed generative AI chatbots with RAG that used different reference information sources: a Cancer Information Service (CIS) chatbot, which used CIS as its reference information source, and a Google chatbot, which used Google search results. We compared the characteristics of their responses with those of a conventional chatbot without RAG, using GPT-4 and GPT-3.5 (-turbo-16K) as the underlying models.
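As a rough illustration of the RAG setup described above, the following minimal sketch retrieves passages from a reference corpus and declines to answer when nothing relevant is found. Everything here is an assumption made for illustration: the keyword-overlap retriever, the names `Passage`, `retrieve`, `build_prompt`, and `answer`, the thresholds, and the `generate` callable that stands in for a call to a language model. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Passage:
    source: str  # hypothetical label for the reference source, e.g. "CIS"
    text: str


# Fixed reply used when the reference source has nothing relevant.
NO_ANSWER = "I could not find this in the reference source."


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(question: str, corpus: List[Passage],
             min_overlap: int = 2, top_k: int = 2) -> List[Passage]:
    """Toy retriever: rank passages by the number of shared words."""
    q_tokens = tokenize(question)
    scored = []
    for passage in corpus:
        overlap = len(q_tokens & tokenize(passage.text))
        if overlap >= min_overlap:
            scored.append((overlap, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def build_prompt(question: str, passages: List[Passage]) -> str:
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return ("Answer using only the context below. "
            "If the context is insufficient, say so.\n"
            f"Context:\n{context}\nQuestion: {question}")


def answer(question: str, corpus: List[Passage],
           generate: Callable[[str], str]) -> str:
    """Decline when retrieval finds nothing; otherwise ask the model."""
    passages = retrieve(question, corpus)
    if not passages:
        return NO_ANSWER
    return generate(build_prompt(question, passages))
```

In this sketch the refusal happens before the model is called, which is one simple way to trade response rate for fewer unsupported answers. A real system would more likely use embedding-based retrieval and let the model itself judge whether the context is sufficient.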

Results:

For questions on information issued by CIS, the hallucination rates for the CIS chatbot were 0% with GPT-4 and 6% with GPT-3.5, whereas those for the Google chatbot were 6% and 10%, respectively. For questions on information not issued by CIS, the Google chatbot generated hallucinations in 19% of cases with GPT-4 and 35% with GPT-3.5. The conventional chatbot returned hallucinations in approximately 40% of its responses. Hallucinations were more likely with Google search results than with CIS as the reference data (odds ratio [OR] 9.4, 95% CI 1.2-17.5, P < .01), and the OR for the conventional chatbot was 16.1 (95% CI 3.7-50.0, P < .001). The conventional chatbot responded to all questions, whereas the response rate for chatbots with RAG was lower (36% to 81%). For questions on information not covered by CIS, the CIS chatbot did not respond, while the Google chatbot generated responses in 52% of cases with GPT-4 and 71% with GPT-3.5.
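For readers unfamiliar with odds ratios, the sketch below shows how an OR and a Wald 95% CI can be computed from a 2x2 table of hallucinated versus non-hallucinated responses. The abstract does not state how the reported ORs were estimated (they may come from a regression model), and it does not give the underlying counts, so the function name `odds_ratio_ci` and the counts in the usage note are purely hypothetical and will not reproduce the reported values.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: float, b: float, c: float, d: float,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: comparison group, hallucinated     b: comparison group, not hallucinated
    c: reference group, hallucinated      d: reference group, not hallucinated
    """
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so that the ratio and its standard error stay finite.
    if min(a, b, c, d) == 0:
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with made-up counts of 10 hallucinated and 40 non-hallucinated responses in the comparison group against 2 and 48 in the reference group, `odds_ratio_ci(10, 40, 2, 48)` gives the OR and its interval for that hypothetical table.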

Conclusions:

Using RAG with reliable information sources significantly reduced the hallucination rate of generative AI chatbots and increased their ability to admit a lack of information, making them more suitable for general use, where users need accurate information.

Citation

Please cite as:

Nishisako S, Higashi T, Wakao F

Reducing Hallucinations and Trade-Offs in Responses in Generative AI Chatbots for Cancer Information: Development and Evaluation Study

JMIR Cancer 2025;11:e70176

DOI: 10.2196/70176

PMID: 40934488

PMCID: 12425422

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Cancer

Date Submitted: Dec 17, 2024

Date Accepted: Jul 7, 2025