Accepted for/Published in: JMIR Biomedical Engineering
Date Submitted: Oct 12, 2024
Date Accepted: Mar 25, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Evaluating the Performance of ChatGPT-4o in Classifying Knee X-rays for Osteoarthritis Detection: Challenges in Sensitivity and Specificity
ABSTRACT
Large language models have gained popularity across multiple fields of healthcare, including radiology. Patients may use tools like ChatGPT-4o to review their imaging and better understand their pathology. Clinicians may also use ChatGPT-4o to increase productivity and reduce human error. However, because this technology is new, the diagnostic efficacy of ChatGPT-4o in radiology is unknown. The aim of this study was to assess the ability of ChatGPT-4o to correctly identify knee osteoarthritis. One thousand knee x-rays were given to ChatGPT-4o. Five hundred were normal knee x-rays, and the other 500 were knees with osteoarthritis, vetted by radiologists. The x-rays were obtained from a publicly available online database on Kaggle. ChatGPT-4o had good sensitivity but poor specificity in identifying knee osteoarthritis. It produced a high rate of false positives and had poor precision. Overall, patients and clinicians should exercise caution when using ChatGPT-4o to analyze imaging for knee osteoarthritis.
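For readers less familiar with the metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, and precision are conventionally computed from a binary confusion matrix. The counts used here are hypothetical placeholders chosen only to illustrate the "good sensitivity, poor specificity" pattern; they are not the study's results, and the names `ConfusionCounts`, `sensitivity`, `specificity`, and `precision` are illustrative assumptions rather than anything taken from the authors' analysis.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary classifier (positive = osteoarthritis)."""
    tp: int  # osteoarthritis x-rays labeled as osteoarthritis
    fp: int  # normal x-rays labeled as osteoarthritis
    tn: int  # normal x-rays labeled as normal
    fn: int  # osteoarthritis x-rays labeled as normal


def sensitivity(c: ConfusionCounts) -> float:
    # Share of true osteoarthritis cases that were detected.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of normal knees that were correctly called normal.
    return c.tn / (c.tn + c.fp)


def precision(c: ConfusionCounts) -> float:
    # Share of positive calls that were actually osteoarthritis.
    return c.tp / (c.tp + c.fp)


# HYPOTHETICAL counts for a balanced 500/500 set; NOT data from the study.
counts = ConfusionCounts(tp=450, fp=300, tn=200, fn=50)

print(f"sensitivity: {sensitivity(counts):.2f}")
print(f"specificity: {specificity(counts):.2f}")
print(f"precision:   {precision(counts):.2f}")
```

With these placeholder counts, many false positives lower both specificity and precision even though most true cases are caught, which is the general pattern the abstract describes.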
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.