JMIR Preprints #48023: Can ChatGPT answer medical questions of the Japanese National Medical Examination: Comparison of accuracy of ChatGPT-3.5 and ChatGPT-4

Can ChatGPT answer medical questions of the Japanese National Medical Examination: Comparison of accuracy of ChatGPT-3.5 and ChatGPT-4

Yasutaka Yanagita;
Daiki Yokokawa;
Shun Uchida;
Junsuke Tawara;
Masatomi Ikusaka

ABSTRACT

ChatGPT (OpenAI, San Francisco, California, USA) has gained considerable attention because of its natural and intuitive responses. One limitation of ChatGPT is that its reinforcement learning is not based on reliable information, so it can provide inaccurate or meaningless answers. The March 2023 update introduced GPT-4, which, according to internal evaluations, is expected to be 40% more likely to produce factual responses than its predecessor, GPT-3.5. We compared the accuracy of ChatGPT based on GPT-4 (ChatGPT4) and on GPT-3.5 (ChatGPT3.5) by having each answer questions from the Japanese National Medical Examination. We excluded questions containing figures or tables, which ChatGPT does not support. Of the 400 questions, 292 were analyzed. The correct response rate for ChatGPT4 was 81.5%, significantly higher than the 42.8% for ChatGPT3.5. Moreover, ChatGPT4 surpassed the passing standard (>72%) for the Japanese National Medical Examination, indicating its potential as a diagnostic and therapeutic decision aid for physicians. We anticipate that future updates of ChatGPT will further enhance its accuracy, making it an invaluable resource in the field of medicine.
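
The arithmetic behind these figures can be checked with a short sketch. Everything beyond the numbers quoted in the abstract is an assumption: the correct-answer counts are back-calculated by rounding the reported percentages, and the pooled two-proportion z-test is a stand-in, since the abstract does not say which test the authors used. Because both models answered the same 292 questions, a paired test such as McNemar's would be more appropriate if per-question results were available, so this is only a rough plausibility check.

```python
import math

# Figures quoted in the abstract.
N_QUESTIONS = 292        # questions analyzed (of 400)
RATE_GPT4 = 0.815        # reported correct response rate, ChatGPT4
RATE_GPT35 = 0.428       # reported correct response rate, ChatGPT3.5
PASS_THRESHOLD = 0.72    # passing standard stated in the abstract

def approx_correct(rate, n):
    """Back-calculate a correct-answer count from a rounded percentage.

    Assumption: the true count is the integer nearest to rate * n.
    """
    return round(rate * n)

def two_proportion_z(k1, k2, n):
    """Pooled two-proportion z-test for two groups of equal size n.

    Treats the two samples as independent, which ignores the fact
    that both models answered the same questions.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = k1 / n, k2 / n
    pooled = (k1 + k2) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

k_gpt4 = approx_correct(RATE_GPT4, N_QUESTIONS)
k_gpt35 = approx_correct(RATE_GPT35, N_QUESTIONS)
z, p_value = two_proportion_z(k_gpt4, k_gpt35, N_QUESTIONS)

print(f"ChatGPT4:   ~{k_gpt4}/{N_QUESTIONS} correct")
print(f"ChatGPT3.5: ~{k_gpt35}/{N_QUESTIONS} correct")
print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
print("ChatGPT4 above passing standard:  ", k_gpt4 / N_QUESTIONS > PASS_THRESHOLD)
print("ChatGPT3.5 above passing standard:", k_gpt35 / N_QUESTIONS > PASS_THRESHOLD)
```

Under these assumptions the difference between the two rates is far larger than sampling noise would explain, which is consistent with the abstract's statement that the ChatGPT4 rate was significantly higher.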

Citation

Please cite as:

Yanagita Y, Yokokawa D, Uchida S, Tawara J, Ikusaka M

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study

JMIR Form Res 2023;7:e48023

DOI: 10.2196/48023

PMID: 37831496

PMCID: 10612006

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Formative Research

Date Submitted: Apr 9, 2023

Date Accepted: Oct 3, 2023
