Accepted for/Published in: JMIR Medical Education
Date Submitted: Apr 25, 2024
Open Peer Review Period: Apr 25, 2024 - Jun 20, 2024
Date Accepted: Sep 23, 2024
Performance Comparison of Junior Residents and ChatGPT in OSCE for Medical History Taking and Chart Writing: A Simulation-Based Evaluation
ABSTRACT
Background:
This study explores the cutting-edge abilities of large language models (LLMs) such as ChatGPT in medical history taking and medical chart documentation, with a focus on their practical effectiveness in clinical settings—an area vital for the progress of medical artificial intelligence.
Objective:
The aim was to assess the capability of ChatGPT versions 3.5 and 4.0 in performing medical history taking and chart documentation in simulated clinical environments. The study compared the performance of non-medical individuals using ChatGPT with that of junior medical residents.
Methods:
A simulation involving standardized patients was designed to mimic authentic medical history-taking interactions. Five non-medical participants utilized ChatGPT versions 3.5 and 4.0 to conduct medical histories and document charts, mirroring the tasks performed by five junior residents in identical scenarios. A total of ten diverse scenarios were examined.
Results:
Evaluation of the medical documentation created by laypersons with ChatGPT assistance and by junior residents was conducted by two senior emergency physicians, employing audio recordings and the final charts. The assessment used the Objective Structured Clinical Examination (OSCE) benchmarks in Taiwan as a reference. ChatGPT 4.0 exhibited substantial enhancements over its predecessor and met or exceeded the performance of human counterparts in terms of both checklist and global assessment scores. Although the overall quality of human consultations remained higher, ChatGPT 4.0's proficiency in medical documentation was notably promising.
Conclusions:
The performance of ChatGPT 4.0 was on par with human participants in OSCE evaluations, signifying its potential in medical history documentation and chart writing. Despite this, the superiority of human consultations in terms of quality was evident. The study underscores both the promise and the current limitations of LLMs in the realm of clinical practice.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.