JMIR Preprints #63058: Advancing Clinical Chatbot Validation using AI-Powered Evaluation with a New Three-Bot Evaluation System

Advancing Clinical Chatbot Validation using AI-Powered Evaluation with a New Three-Bot Evaluation System

Seungheon Choo;
Suyoung Yoo;
Kumiko Endo;
Bao Truong;
Meong Hi Son

ABSTRACT

Background:

The healthcare sector faces a projected shortfall of 10 million workers by 2030. AI automation in patient education and initial therapy screening presents a strategic response to mitigate this shortage and reallocate medical staff to higher-priority tasks.

Objective:

This study introduces a novel three-bot method for efficiently testing and validating early-stage AI healthcare provider chatbots. Several AI patient bots and an evaluator bot were developed so that AI provider chatbots could be tested extensively without involving real patients or researchers.

Methods:

Provider bots interacted with AI patient bots embodying frustrated, anxious, or depressed personas. An evaluator bot reviewed interaction transcripts based on specific criteria. Human experts then reviewed each interaction transcript, and the evaluator bot’s results were compared to human evaluation results to ensure accuracy.
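The workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the bots are stand-in functions rather than language models, the pass/fail criterion is invented, the human ratings are made up, and percent agreement is an assumed stand-in for whatever statistic the study actually used. All function names are hypothetical.

```python
from typing import Callable, List

# A bot is any function mapping the conversation so far to its next message.
Bot = Callable[[List[str]], str]

PERSONAS = ["frustrated", "anxious", "depressed"]  # personas named in the abstract


def make_patient_bot(persona: str) -> Bot:
    """Hypothetical stand-in for an AI patient bot with a fixed persona."""
    def patient(history: List[str]) -> str:
        return f"[{persona} patient] turn {len(history) // 2 + 1}"
    return patient


def provider_bot(history: List[str]) -> str:
    """Hypothetical stand-in for the provider chatbot under test."""
    return f"[provider] reply to: {history[-1]}"


def run_dialogue(patient: Bot, provider: Bot, turns: int = 3) -> List[str]:
    """Alternate patient and provider messages and return the transcript."""
    transcript: List[str] = []
    for _ in range(turns):
        transcript.append(patient(transcript))
        transcript.append(provider(transcript))
    return transcript


def evaluator_bot(transcript: List[str]) -> bool:
    """Invented pass/fail criterion: the provider answered every patient turn."""
    return len(transcript) % 2 == 0 and all(
        line.startswith("[provider]") for line in transcript[1::2]
    )


def percent_agreement(bot_labels: List[bool], human_labels: List[bool]) -> float:
    """Share of transcripts on which the evaluator bot and the human reviewer agree."""
    if len(bot_labels) != len(human_labels) or not bot_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(b == h for b, h in zip(bot_labels, human_labels))
    return matches / len(bot_labels)


# One transcript per persona, scored by the evaluator bot, then compared
# against human ratings (made-up values, for illustration only).
transcripts = {p: run_dialogue(make_patient_bot(p), provider_bot) for p in PERSONAS}
bot_labels = [evaluator_bot(t) for t in transcripts.values()]
human_labels = [True, True, False]
agreement = percent_agreement(bot_labels, human_labels)
```

In a real setup each stand-in function would wrap a call to a language model, and the evaluator would score each transcript against the study's specific criteria rather than a single pass/fail rule.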

Results:

The patient-education bot demonstrated high competency in delivering accurate medical information, easy-to-understand explanations, and empathy. The screening bot excelled in maintaining effective communication, building relationships, and exploring emotions. Statistical analysis confirmed the reliability and accuracy of the AI evaluations.

Conclusions:

The evaluation method provides a safe and effective means to test and refine early versions of healthcare provider chatbots without risking patient safety or demanding excessive time and effort from researchers. It allows healthcare chatbots intended to automate basic medical tasks to be tested and validated rapidly.

Citation

Please cite as:

Choo S, Yoo S, Endo K, Truong B, Son MH

Advancing Clinical Chatbot Validation Using AI-Powered Evaluation With a New 3-Bot Evaluation System: Instrument Validation Study

JMIR Nursing 2025;8:e63058

DOI: 10.2196/63058

PMID: 40014000

PMCID: 11884306

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Nursing

Date Submitted: Jun 9, 2024

Date Accepted: Jan 2, 2025
