Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Sep 19, 2025
Open Peer Review Period: Sep 19, 2025 - Nov 14, 2025
Date Accepted: Feb 27, 2026

The final, peer-reviewed published version of this preprint can be found here:

Large Language Model–Generated Patient Instructions for Prescriptions in Primary Health Care: Preclinical Algorithm Validation

Silveira Nogueira Reis Z, Tuler Albergaria E, Silvina Pagano A, Martins Lage E, Ribeiro de Oliveira F, dos Santos Dias C, Almeida Oliveira J, Miranda Varella Pereira G, Jose Ramos de Oliveira I, Franco Mineiro É, Carvalho Lima Oliveira I, dos Reis de Jesus D, Pereira de Souza Júnior A, de Carvalho Gomes I, André Cuevas Gaete R, Cruz-Correia R, Chaves Dutra da Rocha L

J Med Internet Res 2026;28:e84444

DOI: 10.2196/84444

PMID: 42190235

Large Language Model-Generated Patient Instructions for Prescriptions in Primary Health Care: A Preclinical Evaluation

  • Zilma Silveira Nogueira Reis; 
  • Elisa Tuler Albergaria; 
  • Adriana Silvina Pagano; 
  • Eura Martins Lage; 
  • Flávia Ribeiro de Oliveira; 
  • Cristiane dos Santos Dias; 
  • Juliana Almeida Oliveira; 
  • Gláucia Miranda Varella Pereira; 
  • Isaias Jose Ramos de Oliveira; 
  • Érico Franco Mineiro; 
  • Igor Carvalho Lima Oliveira; 
  • Davi dos Reis de Jesus; 
  • Antônio Pereira de Souza Júnior; 
  • Igor de Carvalho Gomes; 
  • Rodrigo André Cuevas Gaete; 
  • Ricardo Cruz-Correia; 
  • Leonardo Chaves Dutra da Rocha

ABSTRACT

Background:

Large Language Model-Generated Patient Instructions for Prescriptions in Primary Health Care: A Preclinical Evaluation

Objective:

We evaluated the performance of large language models (LLMs) in generating medication usage instructions to complement prescriptions in Primary Health Care.

Methods:

This randomized, blinded experimental study used prescription-inducing scenarios, assigned to 62 health care professionals, to validate instructions generated by LLMs during e-prescribing. The instructions were generated by ChatGPT-4.0, Llama3.1-8B, and Llama3.1-8B-RAG, the latter using Retrieval-Augmented Generation (RAG) based on patient information leaflets. Performance metrics assessed Adequacy, Completeness, Clarity, Personalization, Usefulness, and errors in the generated instructions. Scores were analyzed both overall and per metric, using all evaluations (n=198) and the consensus among evaluators for each test (n=46).
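As an illustration of the RAG idea mentioned in the Methods, the following is a minimal sketch only, not the study's pipeline. It assumes a toy word-overlap retriever and invented leaflet excerpts; the names `LEAFLETS`, `retrieve`, and `build_prompt` are hypothetical, and the call to an actual LLM is deliberately left out.

```python
import re

# Hypothetical leaflet excerpts, invented for illustration only
# (they are not taken from the study or from real leaflets).
LEAFLETS = {
    "amoxicillin": "Take every 8 hours with water. Complete the full course even if you feel better.",
    "metformin": "Take with meals to reduce stomach upset. Do not crush extended-release tablets.",
    "ibuprofen": "Take with food. Avoid if you have a history of stomach ulcers.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, leaflets, k=1):
    """Rank leaflets by word overlap with the query (toy stand-in for a real retriever)."""
    query_words = tokenize(query)
    ranked = sorted(
        leaflets.items(),
        key=lambda item: len(query_words & tokenize(item[0] + " " + item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(prescription, leaflets, k=1):
    """Assemble a prompt that grounds the model in the retrieved leaflet text."""
    context = "\n".join(
        f"[{name}] {text}" for name, text in retrieve(prescription, leaflets, k)
    )
    return (
        "Using only the leaflet excerpts below, write plain-language "
        "instructions for the patient.\n\n"
        f"Leaflet excerpts:\n{context}\n\n"
        f"Prescription: {prescription}\n"
    )


prompt = build_prompt("Amoxicillin 500 mg capsule, every 8 hours for 7 days", LEAFLETS)
print(prompt)
```

In a real system the overlap score would typically be replaced by an embedding-based or lexical search index, and the assembled prompt would be sent to the model; both choices are outside what the abstract specifies.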

Results:

The three models yielded similar scores for producing qualified instructions, by consensus among evaluators (n=46 tests), with median (IQR) values of ChatGPT-4.0: 89.3 (12.5), Llama3.1-8B: 79.5 (46.1), and Llama3.1-8B-RAG: 85.7 (21.9), P=.282. RAG rendered the Llama3.1-8B model equivalent to ChatGPT-4.0 regarding Adequacy, Completeness, Clarity, and Usefulness, and it presented fewer errors in the generated instructions: ChatGPT-4.0 (n=5), Llama3.1-8B (n=11), and Llama3.1-8B-RAG (n=4), P=.040. Concerning specific criteria across 198 tests, Llama3.1-8B-RAG received scores equivalent to those of ChatGPT-4.0 in Adequacy, with mean (SD) 6.24 (2.3) and 6.82 (2.1), respectively, P=.536; Completeness, with mean (SD) 5.94 (2.2) and 6.55 (1.8), respectively, P=.376; Clarity, with mean (SD) 5.77 (2.4) and 6.68 (1.9), respectively, P=.086; and Usefulness, with mean (SD) 5.42 (2.4) and 5.96 (2.2), respectively, P=.627. ChatGPT-4.0 received higher scores in the Personalization criterion, with mean (SD) 7.05 (1.5) in comparison with 5.44 (2.6) for Llama3.1-8B-RAG, P<.001.
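For readers unfamiliar with the median (IQR) summaries reported above, here is a small sketch of how such a summary can be computed. It assumes the single IQR figure is the width Q3 minus Q1 and uses the "inclusive" quartile method, neither of which the abstract states; the scores and the names `model_a` and `model_b` are hypothetical and are not the study's data.

```python
from statistics import median, quantiles

# Hypothetical overall scores on a 0-100 scale; NOT the study's data.
scores = {
    "model_a": [70, 80, 90, 100],
    "model_b": [40, 60, 80, 100],
}


def median_iqr(values):
    """Return (median, IQR width), with IQR taken as Q3 - Q1 (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


summary = {name: median_iqr(vals) for name, vals in scores.items()}
for name, (med, iqr) in summary.items():
    print(f"{name}: median {med:.1f} (IQR {iqr:.1f})")
```

A wider IQR, as with the second hypothetical model here, indicates more spread in the scores even when the medians are close.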

Conclusions:

The open-source LLM enhanced with external information presented performance similar to that of the closed-source model. LLM generation demonstrated potential for instructing patients on medication use. Nonetheless, introducing this innovation into the e-prescribing workflow demands prescriber validation and governance of LLM performance.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.