Accepted for/Published in: JMIR Medical Education

Date Submitted: Apr 7, 2023
Open Peer Review Period: Apr 7, 2023 - Apr 24, 2023
Date Accepted: Jun 14, 2023

The final, peer-reviewed published version of this preprint can be found here:

Takagi S, Watari T, Erabi A, Sakaguchi K

Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study

JMIR Med Educ 2023;9:e48002

DOI: 10.2196/48002

PMID: 37384388

PMCID: 10365615

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Comparison of Performances of ChatGPT and GPT-4 in the Japanese National Medical Examination

  • Soshi Takagi; 
  • Takashi Watari; 
  • Ayano Erabi; 
  • Kota Sakaguchi

ABSTRACT

Background:

ChatGPT’s competence in non-English languages is not well studied.

Objective:

This study compares the performance of ChatGPT and GPT-4 on the Japanese Medical Licensing Examination (JMLE) to evaluate the reliability of these models for clinical reasoning and medical knowledge in non-English languages.

Methods:

The study used the default mode of ChatGPT, based on GPT-3.5; the GPT-4 model of ChatGPT Plus; and the 117th JMLE (2022). A total of 254 questions were included in the final analysis and were categorized into three types: general, clinical, and clinical sentence questions.

Results:

GPT-4 outperformed ChatGPT in accuracy, particularly on general, clinical, and clinical sentence questions. GPT-4 also performed better on difficult questions and on questions about specific diseases. Furthermore, GPT-4 met the passing criteria for the JMLE, indicating its reliability for clinical reasoning and medical knowledge in non-English languages.

Conclusions:

GPT-4 could become a valuable tool for medical education and clinical support in non-English-speaking regions, such as Japan.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final, peer-reviewed paper may be licensed under a CC BY license upon publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.