Accepted for/Published in: JMIR Medical Education
Date Submitted: May 20, 2023
Date Accepted: Oct 20, 2023
ChatGPT vs. Consultants: A Pilot Study on Answering Otorhinolaryngology Case-Based Questions
ABSTRACT
Background:
Large language models (LLMs) such as ChatGPT are increasingly used in medicine and supplement standard search engines as information sources. As a result, LLMs are more often "consulted" about personal medical symptoms.
Objective:
This study aims to evaluate ChatGPT's performance in answering clinical case-based questions in otorhinolaryngology (ORL) in comparison to ORL consultants' answers.
Methods:
We used 41 case-based questions from established ORL study books and past German state examinations for doctors. The questions were answered by both ORL consultants and ChatGPT 3. ORL consultants rated all responses except their own on medical adequacy, conciseness, coherence, and comprehensibility using a 6-point Likert scale. They also indicated whether they believed each answer had been written by an ORL consultant or by ChatGPT. Additionally, the character counts of the answers were compared.
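The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the statistical test that was used, so a simple two-sided permutation test on the difference in means stands in for it here, and the record layout (`source`, `category`, `score`) and all function names are hypothetical. Any values passed in are placeholders, not study data.

```python
import random
from statistics import mean

# Rating categories named in the Methods; the identifier spellings are assumptions.
CATEGORIES = ("medical_adequacy", "conciseness", "coherence", "comprehensibility")


def summarize(ratings):
    """Return the mean Likert score for each (source, category) pair.

    `ratings` is a list of dicts with the hypothetical keys
    "source", "category", and "score" (an integer from 1 to 6).
    """
    grouped = {}
    for r in ratings:
        grouped.setdefault((r["source"], r["category"]), []).append(r["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


def character_counts(answers):
    """Return the character count of each answer text."""
    return [len(text) for text in answers]


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means of a and b.

    A stand-in for whatever test the study actually applied.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

The same `permutation_p_value` function could be applied to the scores of one rating category or to the output of `character_counts` for the two groups of answers.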
Results:
Ratings in all categories were significantly higher for ORL consultants. Although inferior to the consultants' scores, ChatGPT's scores were relatively higher in the semantic categories (conciseness, coherence, and comprehensibility) than in medical adequacy. ORL consultants identified ChatGPT as the source in over 95% of cases. ChatGPT's answers had a significantly higher character count than those of the ORL consultants.
Conclusions:
While ChatGPT provided longer answers to medical problems, their medical adequacy and conciseness were significantly lower than those of the ORL consultants' answers. LLMs have potential as augmentative tools for medical care, but their "consultation" for medical problems carries a high risk of misinformation, as their high semantic quality may mask contextual deficits.
Clinical Trial:
Written correspondence of March 3, 2023, with the ethics committee of the regional medical association Rhineland-Palatinate determined that no specific ethical approval was needed because the study used anonymous text-based questions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.