Accepted for/Published in: JMIR Medical Education
Date Submitted: Jan 28, 2024
Date Accepted: Dec 17, 2024
Performance of ChatGPT-3.5 and ChatGPT-4 in the Taiwan National Pharmacist Licensing Examination: A Comparative Evaluation Study
ABSTRACT
Background:
OpenAI released ChatGPT-3.5 (Chat Generative Pre-Trained Transformer) and GPT-4 between 2022 and 2023. GPT-3.5 has demonstrated proficiency in various examinations, notably the United States Medical Licensing Examination. GPT-4, however, is the more advanced model.
Objective:
This study aims to examine the efficacy of GPT-3.5 and GPT-4 on the Taiwan National Pharmacist Licensing Examination and to ascertain their utility and potential in clinical pharmacy and education.
Methods:
The pharmacist examination in Taiwan consists of 2 stages: basic subjects and clinical subjects. In this study, exam questions were manually entered into the GPT-3.5 and GPT-4 models, and their responses were recorded; graphic-based questions were excluded. This research encompassed the following: 1) determining the answering accuracy of GPT-3.5 and GPT-4, 2) categorizing question types and observing differences in model performance across these categories, and 3) comparing model performance in computational and situational questions. Microsoft Excel and R software were employed for statistical analyses.
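The accuracy tabulation described above can be illustrated with a minimal Python sketch. This is not the authors' analysis, which was carried out in Microsoft Excel and R. The record fields (`category`, `answer`, `gpt35`, `gpt4`, `has_graphic`) and the toy records are hypothetical and chosen only for illustration; they are not study data.

```python
from collections import defaultdict


def accuracy_by_category(records, model):
    """Return {category: fraction correct} for one model.

    Graphic-based questions are skipped, mirroring the exclusion
    described in the Methods. `model` is the key under which that
    model's recorded answer is stored (a hypothetical field name).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        if record["has_graphic"]:
            continue  # excluded from the evaluation
        total[record["category"]] += 1
        if record[model] == record["answer"]:
            correct[record["category"]] += 1
    return {category: correct[category] / total[category] for category in total}


# Hypothetical toy records, not data from the study.
records = [
    {"category": "basic", "answer": "A", "gpt35": "B", "gpt4": "A", "has_graphic": False},
    {"category": "basic", "answer": "C", "gpt35": "C", "gpt4": "C", "has_graphic": False},
    {"category": "clinical", "answer": "D", "gpt35": "D", "gpt4": "D", "has_graphic": False},
    {"category": "clinical", "answer": "B", "gpt35": "B", "gpt4": "A", "has_graphic": False},
    {"category": "clinical", "answer": "A", "gpt35": "A", "gpt4": "A", "has_graphic": True},
]

gpt35_accuracy = accuracy_by_category(records, "gpt35")
gpt4_accuracy = accuracy_by_category(records, "gpt4")
```

The same function could be reused for the computational and situational comparison by storing the question type in the `category` field.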
Results:
GPT-4 achieved an accuracy rate of 72.9%, exceeding GPT-3.5’s 59.1%. In basic subjects, GPT-4 significantly outperformed GPT-3.5 (73.4% vs. 53.2%). In clinical subjects, however, only minor differences in accuracy were observed between the two models. GPT-4 also outperformed GPT-3.5 in computational and situational questions.
Conclusions:
GPT-4 convincingly outperformed GPT-3.5 in the national pharmacist examination, especially in foundational subjects. GPT-4 thus has considerable potential in clinical pharmacy and education settings; for example, it could provide drug interaction information and aid in consultations. Future research could integrate ChatGPT with medical databases, offering a platform for medical decision-making and thereby enhancing the quality of medical care.
Citation
Per the author's request the PDF is not available.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.