Accepted for/Published in: JMIR Formative Research

Date Submitted: Apr 9, 2023
Date Accepted: Oct 3, 2023

The final, peer-reviewed published version of this preprint can be found here:

Yanagita Y, Yokokawa D, Uchida S, Tawara J, Ikusaka M

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study

JMIR Form Res 2023;7:e48023

DOI: 10.2196/48023

PMID: 37831496

PMCID: 10612006

Can ChatGPT Answer Medical Questions of the National Medical Licensing Examination in Japan: Evaluation of Accuracy of ChatGPT

  • Yasutaka Yanagita
  • Daiki Yokokawa
  • Shun Uchida
  • Junsuke Tawara
  • Masatomi Ikusaka

ABSTRACT

ChatGPT (OpenAI, San Francisco, California, USA) has gained considerable attention because of its natural and intuitive responses. One limitation of ChatGPT is that its reinforcement learning is not based on reliable information, so it can provide inaccurate or meaningless answers. Fortunately, the March 2023 update introduced GPT-4, which, according to internal evaluations, is expected to increase the likelihood of producing factual responses by 40% compared with its predecessor, GPT-3.5. We evaluated the accuracy of ChatGPT based on GPT-4 (ChatGPT4) and on GPT-3.5 (ChatGPT3.5) by having each answer questions from the Japanese National Medical Examination. We excluded questions containing figures and tables, which ChatGPT does not support. Of the 400 questions, 292 were analyzed. The correct response rate for ChatGPT4 was 81.5%, significantly higher than the 42.8% rate for ChatGPT3.5. Moreover, ChatGPT4 surpassed the passing standard (>72%) for the Japanese National Medical Examination, indicating its potential as a diagnostic and therapeutic decision aid for physicians. We anticipate that future updates of ChatGPT will further enhance its accuracy, making it an invaluable resource in the field of medicine.

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.