Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 6, 2024
Date Accepted: May 23, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Assessing ChatGPT’s Clinical Competency and Patient Perceptions in Emergency Medicine: Insights from Clinical Performance Examinations
ABSTRACT
Background:
Given its unique challenges, such as high patient volume and the need for urgent interventions, emergency medicine stands to benefit from artificial intelligence (AI). However, it remains difficult to assess the applicability of AI systems to real-world emergency medicine practice, which requires not only medical knowledge but also adaptable problem-solving and effective communication skills.
Objective:
We aimed to evaluate ChatGPT's performance in comparison to human doctors in simulated emergency medicine settings, using the Clinical Performance Examination (CPX) framework.
Methods:
Twenty-eight text-based cases and four image-based cases relevant to emergency medicine were selected. Twelve human doctors were recruited as the physician comparison group. Both ChatGPT and the human doctors were instructed to manage each case with simulated patients as they would in a real clinical setting. After the CPX sessions, an emergency medicine professor rated the conversation records on history taking, clinical accuracy, and empathy using a 5-point Likert scale. Simulated patients completed a 5-point Likert-scale survey covering overall comprehensibility, credibility, and concern reduction for each case; they also rated how similar the doctor they interacted with was to a human doctor. The mean scores for ChatGPT were then compared with those of the human doctors.
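The abstract does not name the statistical test behind the reported P values. As one plausible reading of the group comparison, the sketch below contrasts two sets of per-case Likert ratings with an independent-samples Welch t-test in Python using SciPy; the choice of test, the variable names, and all numeric values are illustrative assumptions, not the study's data or method.

```python
# Minimal sketch of the group comparison, assuming a Welch t-test.
# All ratings below are hypothetical placeholders, NOT the study's data.
import numpy as np
from scipy import stats

# Hypothetical per-case 5-point Likert ratings for each group
chatgpt_scores = np.array([4, 5, 4, 3, 5, 4, 4, 5])
doctor_scores = np.array([3, 2, 3, 2, 3, 3, 2, 3])

# Welch t-test (equal_var=False) does not assume equal group variances
t_stat, p_value = stats.ttest_ind(chatgpt_scores, doctor_scores, equal_var=False)

print(f"ChatGPT mean {chatgpt_scores.mean():.2f} (SD {chatgpt_scores.std(ddof=1):.2f})")
print(f"Doctors mean {doctor_scores.mean():.2f} (SD {doctor_scores.std(ddof=1):.2f})")
print(f"Welch t = {t_stat:.2f}, P = {p_value:.3f}")
```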
Results:
ChatGPT scored significantly higher than the physicians in both history taking (mean score 3.91 [SD 0.67] vs. 2.67 [SD 0.78], P < 0.01) and empathy (mean score 4.50 [SD 0.67] vs. 1.75 [SD 0.62], P < 0.01), whereas there was no significant difference in clinical accuracy. In the simulated-patient survey, ChatGPT scored higher for concern reduction (mean score 4.33 [SD 0.78] vs. 3.58 [SD 0.90], P = 0.04). ChatGPT also scored higher for comprehensibility and credibility, but these differences were not statistically significant. No significant difference was observed in the similarity rating (mean score 3.50 [SD 1.78] vs. 3.25 [SD 1.86], P = 0.71).
Conclusions:
ChatGPT’s performance highlights its potential as a valuable adjunct in emergency medicine, demonstrating comparable proficiency in knowledge application, efficiency, and empathetic patient interaction. These results suggest that a collaborative healthcare model, integrating AI with human expertise, could enhance patient care and outcomes.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.