Accepted for/Published in: JMIR Nursing

Date Submitted: Jun 9, 2024
Date Accepted: Jan 2, 2025

The final, peer-reviewed published version of this preprint can be found here:

Choo S, Yoo S, Endo K, Truong B, Son MH

Advancing Clinical Chatbot Validation Using AI-Powered Evaluation With a New 3-Bot Evaluation System: Instrument Validation Study

JMIR Nursing 2025;8:e63058

DOI: 10.2196/63058

PMID: 40014000

PMCID: 11884306

Advancing Clinical Chatbot Validation using AI-Powered Evaluation with a New Three-Bot Evaluation System

  • Seungheon Choo
  • Suyoung Yoo
  • Kumiko Endo
  • Bao Truong
  • Meong Hi Son

ABSTRACT

Background:

The healthcare sector faces a projected shortfall of 10 million workers by 2030. Automating patient education and initial therapy screening with AI offers one way to mitigate this shortage and free medical staff for higher-priority tasks.

Objective:

This study introduces a novel three-bot method for efficiently testing and validating early-stage AI healthcare provider chatbots. To extensively test AI provider chatbots without involving real patients or researchers, various AI patient bots and an evaluator bot were developed.

Methods:

Provider bots interacted with AI patient bots embodying frustrated, anxious, or depressed personas. An evaluator bot reviewed interaction transcripts based on specific criteria. Human experts then reviewed each interaction transcript, and the evaluator bot’s results were compared to human evaluation results to ensure accuracy.
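The three-bot arrangement described in the Methods can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`patient_reply`, `provider_reply`, `run_dialogue`, `evaluate`), the canned responses, and the keyword-based criteria are hypothetical placeholders standing in for language-model calls and for the study's actual evaluation criteria, which the abstract does not list.

```python
from dataclasses import dataclass, field

# Persona labels taken from the abstract.
PERSONAS = ("frustrated", "anxious", "depressed")


@dataclass
class Transcript:
    persona: str
    turns: list = field(default_factory=list)  # (speaker, text) pairs


def patient_reply(persona, turn_index):
    # Hypothetical stand-in for a patient-bot model call.
    return f"[{persona} patient, turn {turn_index}] I have a question about my treatment."


def provider_reply(patient_text):
    # Hypothetical stand-in for a provider-bot model call.
    return "I understand this is difficult. Here is what the treatment involves."


def run_dialogue(persona, n_turns=2):
    # Alternate patient and provider turns and record them in order.
    transcript = Transcript(persona)
    for i in range(n_turns):
        question = patient_reply(persona, i)
        transcript.turns.append(("patient", question))
        transcript.turns.append(("provider", provider_reply(question)))
    return transcript


def evaluate(transcript):
    # Hypothetical evaluator: a keyword check used only so the sketch runs.
    # A real evaluator bot would score the transcript against its own criteria.
    provider_text = " ".join(
        text for speaker, text in transcript.turns if speaker == "provider"
    ).lower()
    return {
        "empathy": "understand" in provider_text,
        "explains_treatment": "treatment" in provider_text,
    }


transcripts = [run_dialogue(p) for p in PERSONAS]
bot_scores = [evaluate(t) for t in transcripts]

for t, s in zip(transcripts, bot_scores):
    print(t.persona, s)
```

In this sketch the transcripts, rather than the live conversation, are what the evaluator sees. That matches the description above, in which human experts later review the same transcripts so the two sets of ratings can be compared.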

Results:

The patient-education bot demonstrated high competency in delivering accurate medical information, easy-to-understand explanations, and empathy. The screening bot excelled in maintaining effective communication, building relationships, and exploring emotions. Statistical analysis confirmed the reliability and accuracy of the AI evaluations.
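The abstract does not name the statistics used to compare the evaluator bot with the human experts. As one common way to quantify agreement between two raters, the sketch below computes percent agreement and Cohen's kappa. The choice of these measures is an assumption, and the ratings are made-up toy values for illustration only, not data from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    # Fraction of items on which the two raters gave the same label.
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    # Cohen's kappa: observed agreement corrected for chance agreement.
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    labels = set(a) | set(b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Toy ratings (1 = criterion met, 0 = not met); invented for illustration.
bot_ratings = [1, 1, 0, 1, 0, 1, 1, 0]
human_ratings = [1, 1, 0, 1, 1, 1, 0, 0]

agreement = percent_agreement(bot_ratings, human_ratings)
kappa = cohens_kappa(bot_ratings, human_ratings)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa is lower than raw agreement here because some matches would be expected by chance alone when both raters mark most items as met.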

Conclusions:

This evaluation method provides a safe and effective way to test and refine early versions of healthcare provider chatbots without risking patient safety or demanding excessive time and effort from researchers. It allows rapid testing and validation of healthcare chatbots intended to automate basic medical tasks.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.