Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Feb 25, 2024
Date Accepted: Jul 4, 2024
Large language models can match junior clinicians in discharge letter writing: a single-blinded study
ABSTRACT
Background:
Discharge letters are a critical component of continuity of care between specialists and primary care providers, but they are time-consuming to write, under-prioritized relative to direct clinical care, and often delegated to junior doctors. Prior studies assessing the quality of discharge summaries written for inpatient hospital admissions show inadequacies in many domains. Large language models such as GPT can summarize large volumes of unstructured free text, such as electronic medical records, and have the potential to automate such tasks, providing time savings and consistency in quality.
Objective:
To assess the performance of GPT-4 in generating discharge letters written from Urology specialist outpatient clinics to primary care providers, and compare their quality against letters written by junior clinicians.
Methods:
Fictional electronic records were written by physicians, simulating five common Urology outpatient cases with long-term follow-up. Records comprised simulated consultation notes, referral letters and replies, and relevant discharge summaries from inpatient admissions. GPT-4 was tasked to write discharge letters for these cases, with a specified target audience of primary care providers who would be continuing the patient’s care. Prompts were written for safety, content, and style. Concurrently, junior clinicians were provided with the same case records and instructional prompts. GPT-4 output was assessed by the study team for instances of hallucination. A blinded panel of primary care physicians then evaluated the letters using a standardized questionnaire tool.
Results:
GPT-4 outperformed human counterparts in information provision but was less concise. GPT-4 had no instances of hallucination. There were no statistically significant differences in clarity, collegiality, follow-up recommendations, or overall satisfaction between letters generated by humans and by GPT-4.
Conclusions:
Discharge letters written by GPT-4 had equivalent quality to those written by junior clinicians, without any hallucinations. This study demonstrates proof of concept that LLMs can be useful and safe tools in clinical documentation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.