JMIR Preprints #71613: Patient triage and guidance in emergency departments using Large Language Models: Multimetric Assessment

Patient triage and guidance in emergency departments using Large Language Models: Multimetric Assessment

Chenxu Wang;
Fei Wang;
Shuhan Li;
Qing-wen Ren;
Xiaomei Tan;
Yaoyu Fu;
Di Liu;
Guangwu Qian;
Yu Cao;
Rong Yin;
Kang Li

ABSTRACT

Background:

Emergency departments (EDs) face significant challenges due to overcrowding, prolonged waiting times, and staffing shortages, leading to increased strain on healthcare systems. Efficient triage systems and accurate departmental guidance are critical to alleviating these pressures. Recent advancements in Large Language Models (LLMs), such as ChatGPT, offer potential solutions for improving patient triage and outpatient department selection in emergency settings.

Objective:

This study aims to assess the accuracy, consistency, and feasibility of GPT-4-based ChatGPT models (GPT-4o and GPT-4-Turbo) for patient triage using the Modified Early Warning Score (MEWS), and to evaluate GPT-4o’s ability to provide accurate outpatient department guidance based on simulated patient scenarios.

Methods:

A two-phase experimental study was conducted. In phase one, two ChatGPT models (GPT-4o and GPT-4-Turbo) were evaluated for MEWS-based patient triage accuracy using 1,854 simulated patient scenarios. Accuracy and consistency were assessed before and after prompt engineering. In phase two, GPT-4o was tested for outpatient department selection accuracy using 264 scenarios sourced from the Chinese Medical Case Repository. Each scenario was independently evaluated by GPT-4o three times. Data analyses included Wilcoxon tests, Kendall correlation coefficients, and logistic regression.

Results:

In the first phase, ChatGPT’s MEWS-based triage accuracy improved following prompt engineering. GPT-4-Turbo outperformed GPT-4o, reaching an accuracy of 100% compared with 96.2% for GPT-4o, even though GPT-4o had performed better before prompt engineering; this suggests that GPT-4-Turbo may be more adaptable to prompt optimization. In the second phase, GPT-4o, which showed better emotional responsiveness than GPT-4-Turbo, achieved an overall guidance accuracy of 92.63% (95% CI, 90.34%, 94.93%), with the highest accuracy in internal medicine (93.51% [95% CI, 90.85%, 96.17%]) and the lowest in general surgery (91.46% [95% CI, 86.50%, 96.43%]).
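
As a reading aid for the accuracy figures reported with 95% CIs, the following minimal sketch shows one common way to compute a confidence interval for a proportion. It assumes a normal-approximation (Wald) interval, which the abstract does not confirm as the authors' method. The function name `wald_ci` and the counts in the usage line are hypothetical and are not data from the study.

```python
import math


def wald_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) for a normal-approximation interval.

    Assumption: a Wald interval with z = 1.96 for 95% coverage; the study
    may have used a different interval method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts chosen for illustration only.
accuracy, lower, upper = wald_ci(successes=463, total=500)
print(f"{accuracy:.2%} (95% CI, {lower:.2%}, {upper:.2%})")
```

The Wald interval is simple but can be unreliable for proportions close to 0% or 100% or for small samples; a Wilson interval is a common alternative in those cases.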

Conclusions:

ChatGPT demonstrates promising capability for supporting patient triage and outpatient guidance in EDs. GPT-4-Turbo showed greater adaptability to prompt engineering, whereas GPT-4o exhibited superior responsiveness and emotional interaction, essential for patient-facing tasks. Future studies should explore real-world implementation and address identified limitations to enhance ChatGPT’s clinical integration.

Citation

Please cite as:

Wang C, Wang F, Li S, Ren Qw, Tan X, Fu Y, Liu D, Qian G, Cao Y, Yin R, Li K

Patient Triage and Guidance in Emergency Departments Using Large Language Models: Multimetric Study

J Med Internet Res 2025;27:e71613

DOI: 10.2196/71613

PMID: 40374171

PMCID: 12123234

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 22, 2025

Open Peer Review Period: Jan 22, 2025 - Feb 6, 2025

Date Accepted: May 1, 2025
