JMIR Preprints #91019: Can GPT-5 Support Licensing Exam Preparation? Analysis of Accuracy, Reasoning, and Semantic Similarity Across Rehabilitation Disciplines

Can GPT-5 Support Licensing Exam Preparation? Analysis of Accuracy, Reasoning, and Semantic Similarity Across Rehabilitation Disciplines

Christy Muasher-Kerwin;
M. Courtney Hughes;
Aida Sanatizadeh

ABSTRACT

Background:

As artificial intelligence tools become more common in health professional education, students are increasingly turning to large language models such as ChatGPT (GPT-5) to support studying for high-stakes licensing exams. Although these models can generate accurate factual responses, their ability to mirror expert reasoning and provide conceptually sound explanations remains uncertain. This study examined GPT-5’s accuracy, reasoning patterns, and semantic similarity to validated rehabilitation board-preparation materials in physical therapy, occupational therapy, and speech-language pathology.

Methods:

Three hundred multiple-choice questions (100 per discipline) from verified board-preparation sources were entered into GPT-5 without hints or prompting. Each model response was recorded as correct or incorrect. The board-preparation sources labeled each question with a reasoning type (inductive, deductive, analytical, evaluative, or inferential), and these labels were used to determine GPT-5 accuracy by reasoning type. Semantic similarity between GPT-5 and expert rationales was calculated using cosine similarity. Descriptive statistics summarized performance across disciplines. Incorrect responses underwent qualitative content analysis to identify shared conceptual challenges, with dual-coder review to establish agreement.
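As an illustration of the similarity measure named above, the following is a minimal sketch of cosine similarity between two rationales. The abstract does not say how the rationales were converted to vectors, so the `bag_of_words_vectors` helper is a hypothetical stand-in (simple word counts) and should not be read as the authors' method; only `cosine_similarity` reflects the standard formula.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: an empty vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def bag_of_words_vectors(text_a, text_b):
    """Hypothetical stand-in vectorizer: word counts over a shared vocabulary.

    The study's actual text representation is not stated in the abstract.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[w] for w in vocab], [counts_b[w] for w in vocab]


if __name__ == "__main__":
    # Made-up example strings, not taken from the study materials.
    model_vec, expert_vec = bag_of_words_vectors(
        "strengthen the quadriceps before gait training",
        "begin gait training after the quadriceps are strengthened",
    )
    print(round(cosine_similarity(model_vec, expert_vec), 3))
```

A score of 1 means the two vectors point in the same direction and 0 means they share no weighted terms; with a different vectorizer the same function applies unchanged.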

Results:

GPT-5 demonstrated high factual accuracy overall, with discipline-specific variation: PT 91%, SLP 83%, and OT 78%. Deductive reasoning questions had the highest accuracy across disciplines, reaching 100% in PT. Mean semantic similarity between GPT-5 and expert rationales was 0.707 and was highest for deductive (0.712) and analytical (0.708) reasoning. Qualitative review indicated consistent issues with advanced reasoning tasks.
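The accuracy figures above are simple proportions, and a short sketch may make the tallying concrete. The `accuracy_by_group` function and its `(group, is_correct)` record format are hypothetical illustrations, not the authors' analysis code. The pooled value is derived here from the three reported percentages and the stated 100 questions per discipline; it is not a figure reported in the abstract.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return the proportion correct for each group label.

    `records` is an iterable of (group, is_correct) pairs, where the group
    could be a discipline or a reasoning type (hypothetical record format).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Correct counts implied by the reported percentages (100 questions each).
reported_correct = {"PT": 91, "SLP": 83, "OT": 78}
QUESTIONS_PER_DISCIPLINE = 100

# Pooled accuracy across all 300 questions: 252 / 300.
pooled_accuracy = sum(reported_correct.values()) / (
    QUESTIONS_PER_DISCIPLINE * len(reported_correct)
)

if __name__ == "__main__":
    print(f"pooled accuracy: {pooled_accuracy:.2%}")
```

Passing reasoning-type labels instead of discipline labels as the group gives the per-reasoning-type breakdown described in the Methods.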

Conclusions:

GPT-5 reproduced substantial domain knowledge from rehabilitation board-preparation materials but showed persistent deficits in higher-order reasoning. Although semantic similarity to expert explanations was high, inconsistencies in inferential and evaluative logic limit its reliability as an unsupervised study tool. Findings highlight the need for guided use of LLMs in health-professions education, further research across specialties and exam formats, and clearer standards for integrating AI-based study aids to ensure educational quality and patient safety.

Citation

Please cite as:

Muasher-Kerwin C, Hughes MC, Sanatizadeh A

Can GPT-5 Support Licensing Examination Preparation? Analysis of Accuracy, Reasoning, and Semantic Similarity Across Rehabilitation Disciplines

JMIR Rehabil Assist Technol 2026;13:e91019

DOI: 10.2196/91019

PMID: 42166754

PMCID: 13193662

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Rehabilitation and Assistive Technologies

Date Submitted: Jan 7, 2026

Date Accepted: Apr 9, 2026
