JMIR Preprints #82971: Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI

Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI

Agnik Saha;
Victoria Churchill;
Anny D. Rodriguez;
Ugur Kursuncu;
Muhammed Y. Idris

ABSTRACT

Background:

Effective communication about breast and cervical cancers remains a public health challenge, with widespread misinformation and barriers to cancer-related language understanding. Large Language Models (LLMs) offer potential for scalable health communication, yet tradeoffs between quality, safety, and accessibility of general-purpose and medical-domain LLMs remain underexplored.

Objective:

We propose a comprehensive evaluation framework and systematically assess the performance of LLMs in generating breast and cervical cancer information, with a focus on linguistic quality, safety and trustworthiness, and communication accessibility and affectiveness.

Methods:

This mixed-methods evaluation study assessed outputs from five general-purpose and three medical large language models (LLMs) using real-world breast and cervical cancer–related questions curated from publicly available medical datasets. LLM-generated responses were evaluated in a controlled offline setting. Primary outcomes included linguistic quality (fluency, coherence, accuracy), safety and trustworthiness (toxicity, bias, harm potential), and communication accessibility and affectiveness (readability, empathy, clarity). Qualitative ratings were performed by domain experts, while quantitative metrics were compared across models. Statistical analyses included Welch’s ANOVA to detect differences in metric scores, Games-Howell tests for pairwise comparisons, and Hedges’ g to assess effect sizes.
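As an illustration of two of the statistics named in the Methods, the sketch below computes the Welch's ANOVA F statistic (with its degrees of freedom) and Hedges' g using only the Python standard library. It is a minimal sketch under stated assumptions, not the authors' analysis code: the function names `hedges_g` and `welch_anova` are hypothetical, the score lists are made-up toy values, and the small-sample correction uses the common approximation J = 1 - 3 / (4(n1 + n2) - 9).

```python
import math
from statistics import mean, variance


def hedges_g(a, b):
    """Hedges' g: standardized mean difference with small-sample correction."""
    na, nb = len(a), len(b)
    # Pooled standard deviation from the unbiased sample variances.
    pooled_sd = math.sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    d = (mean(a) - mean(b)) / pooled_sd  # Cohen's d
    # Approximate bias correction factor J (an assumption of this sketch).
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisier groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    w_sum = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / w_sum
    between = sum(
        w * (m - grand_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum((1 - w / w_sum) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Toy metric scores for three hypothetical models (illustrative values only).
model_a = [1, 2, 3, 4, 5]
model_b = [2, 3, 4, 5, 6]
model_c = [3, 4, 5, 6, 7]

f_stat, df1, df2 = welch_anova([model_a, model_b, model_c])
g_ab = hedges_g(model_a, model_b)
```

Obtaining a p-value from `f_stat` requires the F distribution with (`df1`, `df2`) degrees of freedom, and the Games-Howell pairwise comparisons require the studentized range distribution; both are left to a statistics package in practice.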

Results:

General-purpose LLMs, particularly Llama 3 and Gemma, demonstrated superior linguistic quality and affectiveness but often produced complex outputs that may limit accessibility. In contrast, medical LLMs (e.g., MedAlpaca, BioMistral) generated simpler content suitable for broader audiences but scored lower in safety and empathy due to higher levels of hallucination, bias, and toxicity.

Conclusions:

While LLMs show promise for improving digital cancer communication, our findings reveal a trade-off between domain specialization and overall communication quality and safety. Future development of health-focused LLMs should prioritize hybrid modeling strategies to enhance trust, clarity, and clinical relevance in patient-facing tools.

Clinical Trial:

Not applicable

Citation

Please cite as:

Saha A, Churchill V, Rodriguez AD, Kursuncu U, Idris MY

Large Language Models for Breast and Cervical Cancers Communication: Mixed Methods Evaluation Study Assessing Linguistic Quality, Safety, and Accessibility

JMIR Cancer 2026;12:e82971

DOI: 10.2196/82971

PMID: 42361270

PMCID: 13308906

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Cancer

Date Submitted: Aug 25, 2025

Date Accepted: Dec 22, 2025
