Accepted for/Published in: JMIR Cancer

Date Submitted: Aug 27, 2024
Date Accepted: Mar 21, 2025

The final, peer-reviewed published version of this preprint can be found here:

Chatbot for the Return of Positive Genetic Screening Results for Hereditary Cancer Syndromes: Prompt Engineering Project

Coen E, Del Fiol G, Kaphingst KA, Borsato E, Shannon J, Stevens Smith H, Masino A, Allen CG

Chatbot for the Return of Positive Genetic Screening Results for Hereditary Cancer Syndromes: Prompt Engineering Project

JMIR Cancer 2025;11:e65848

DOI: 10.2196/65848

PMID: 40493514

PMCID: 12172806

Chatbot for the Return of Positive Genetic Screening Results for Hereditary Cancer Syndromes: A Prompt Engineering Project

  • Emma Coen
  • Guilherme Del Fiol
  • Kimberly A. Kaphingst
  • Emerson Borsato
  • Jackie Shannon
  • Hadley Stevens Smith
  • Aaron Masino
  • Caitlin G. Allen

ABSTRACT

Background:

The increasing demand for population-wide genomic screening (PGS) and the limited availability of genetic counseling resources have created a pressing need for innovative service delivery models. Chatbots powered by large language models (LLMs) have shown potential in genomic services, particularly in pre-test counseling, but their application in returning positive PGS results remains underexplored. Leveraging advanced LLMs like GPT-4 offers an opportunity to address this gap by delivering accurate, contextual, and user-centered communication to individuals receiving positive genetic test results.

Objective:

This study aimed to develop and evaluate an LLM-powered chatbot for returning positive PGS results for hereditary cancer syndromes.

Methods:

We used a three-step prompt engineering process, combining Retrieval-Augmented Generation (RAG) and few-shot techniques, to develop an open-response chatbot. Training materials included patient frequently asked questions, genetic counseling scripts, and patient-derived queries. The chatbot was refined iteratively against 13 training questions. Its performance was then evaluated through expert ratings of its responses to two hypothetical patient scenarios, which were intended to represent common but distinct patient profiles in terms of gender, race, ethnicity, age, and background knowledge. Domain experts rated the chatbot on a 5-point Likert scale across eight predefined criteria: tone, clarity, program accuracy, domain accuracy, robustness, efficiency, boundaries, and usability.
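As a rough illustration of how RAG and few-shot techniques can be combined in a single prompt, the following Python sketch retrieves the reference passages most relevant to a patient question and places them, together with example exchanges, ahead of the question. It is not the authors' implementation. Every name here (Passage, retrieve, build_prompt, SYSTEM_INSTRUCTION) is hypothetical, the passages and example exchange are placeholders rather than study materials, and simple word overlap stands in for whatever retriever the project actually used. No LLM is called; the sketch only assembles the prompt text.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical label, e.g. "FAQ" or "counseling script"
    text: str


# Placeholder instruction and content; none of this comes from the study.
SYSTEM_INSTRUCTION = (
    "You help patients understand a positive genetic screening result. "
    "Answer only from the reference material and suggest a genetic "
    "counselor for anything outside it."
)

PASSAGES = [
    Passage("FAQ", "A positive result means a variant linked to hereditary cancer risk was found."),
    Passage("counseling script", "A genetic counselor can review your result and discuss next steps."),
    Passage("FAQ", "Program staff can help you schedule a follow-up appointment."),
]

EXAMPLES = [
    ("Who can I talk to about this?",
     "A genetic counselor can go over your result with you."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its words with edge punctuation removed."""
    return {w.strip(".,?!:;()").lower() for w in text.split()} - {""}


def retrieve(question: str, passages: list[Passage], k: int = 2) -> list[Passage]:
    """Return the k passages sharing the most words with the question.

    Word overlap is an assumption made for this sketch; a real system
    would more likely use embedding similarity.
    """
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p.text)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[Passage],
                 examples: list[tuple[str, str]], k: int = 2) -> str:
    """Assemble instruction, retrieved context, few-shot examples, and question."""
    context = "\n".join(f"- [{p.source}] {p.text}" for p in retrieve(question, passages, k))
    shots = "\n\n".join(f"Patient: {q}\nChatbot: {a}" for q, a in examples)
    return (
        f"{SYSTEM_INSTRUCTION}\n\n"
        f"Reference material:\n{context}\n\n"
        f"Example exchanges:\n{shots}\n\n"
        f"Patient: {question}\nChatbot:"
    )


if __name__ == "__main__":
    print(build_prompt("What does a positive result mean?", PASSAGES, EXAMPLES))
```

In a design like this, the retrieved passages ground the answer in program materials, while the example exchanges demonstrate the expected tone and the boundaries of what the chatbot should answer.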

Conclusions:

This study demonstrates the feasibility of using LLM-powered chatbots to support the return of positive genomic screening results. The chatbot effectively handled open-ended patient queries, maintained conversational boundaries, and delivered user-friendly responses. However, enhancements in program-specific accuracy are essential to maximize its utility. Future research will explore hybrid chatbot designs that combine the strengths of LLMs with rule-based components to improve scalability, accuracy, and accessibility in genomic service delivery. The findings underscore the potential of generative AI tools to address resource limitations and improve the accessibility of genomic healthcare services.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.