Maintenance Notice

Due to necessary scheduled maintenance, the JMIR Publications website will be unavailable from Wednesday, July 01, 2020 at 8:00 PM to 10:00 PM EST. We apologize in advance for any inconvenience this may cause you.

Who will be affected?

Currently submitted to: Journal of Medical Internet Research

Date Submitted: May 20, 2026
Open Peer Review Period: May 21, 2026 - Jul 16, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Large Language Models Produce Patient-Friendly Translations of Outpatient Reports at Expert Quality via Few-Shot Prompting: A Blinded Comparative Evaluation

  • Timo Elias Reimann; 
  • Vanessa Haug; 
  • Filippo Maria Verri; 
  • Christoph Leinert; 
  • Tim Fleiner; 
  • Michael Denkinger; 
  • Thomas Derya Kocar

ABSTRACT

Background:

Effective communication between clinicians and patients plays a significant role in the successful delivery of healthcare. In geriatric medicine, where multimorbidity, polypharmacy, and functional limitations often make treatment plans complex, artificial intelligence (AI) translations of clinical reports into patient-friendly language offer a promising way to improve communication.

Objective:

This study investigates whether AI can generate patient-friendly translations of outpatient reports at expert quality.

Methods:

We used Gemma-3-27b-it, an instruction fine-tuned open-source large language model (LLM) by Google LLC (California, USA), and applied few-shot prompting to enable the generation of patient-friendly translations of 64 anonymized geriatric outpatient reports, written in German, while preserving clinical meaning. LLM-generated translations were compared with those produced by two geriatric experts in a blinded evaluation. Quality was assessed using three approaches: (1) embedding-based semantic similarity to the original report, (2) blinded expert review by two clinicians, and (3) review by patient representatives, with the latter two utilizing a standardized Likert-scale questionnaire.

Results:

AI-generated translations demonstrated statistical equivalence to geriatrician-authored simplifications regarding semantic similarity (P=.018) and were preferred by patient representatives (P<.001) as well as one expert reviewer (P=.001). Expert reviewers perceived physician-authored simplifications as more complete (P<.001), a finding that correlated with text length (P<.001).

Conclusions:

This study demonstrates the feasibility of utilizing LLMs via few-shot prompting to simplify medical reports for improved patient comprehension in geriatric care. Future prospective studies should examine whether such AI-driven translations positively impact health-related outcomes, adherence, and patient engagement.


 Citation

Please cite as:

Reimann TE, Haug V, Verri FM, Leinert C, Fleiner T, Denkinger M, Kocar TD

Large Language Models Produce Patient-Friendly Translations of Outpatient Reports at Expert Quality via Few-Shot Prompting: A Blinded Comparative Evaluation

JMIR Preprints. 20/05/2026:101882

DOI: 10.2196/preprints.101882

URL: https://preprints.jmir.org/preprint/101882

Download PDF


Request queued. Please wait while the file is being generated. It may take some time.

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.