Accepted for/Published in: JMIR Medical Education
Date Submitted: Aug 10, 2025
Date Accepted: Dec 13, 2025
AI-Driven OSCE Generation in Digital Health Education: Comparative Analysis of Three GPT-4o Configurations
ABSTRACT
Background:
Objective Structured Clinical Examinations (OSCEs) are used as an evaluation method in medical education but require significant pedagogical expertise and investment, especially in emerging fields like digital health. Large Language Models (LLMs), such as ChatGPT, have shown potential in automating educational content generation. However, OSCE generation using LLMs remains underexplored.
Objective:
This study evaluates three GPT-4o configurations for generating OSCE stations in digital health: (1) Standard GPT with a simple prompt and OSCE guidelines; (2) Personalized GPT with a simple prompt, OSCE guidelines, and a reference book in digital health; and (3) Simulated-Agents GPT with a structured prompt simulating specialized OSCE agents and the digital health reference book.
Methods:
Twenty-four OSCE stations were generated across 8 digital health topics with each GPT-4o configuration. Format compliance was evaluated by one expert, while educational content was assessed independently by two digital health experts, blinded to the GPT-4o configuration, using a comprehensive assessment grid. Statistical analyses were performed using Kruskal-Wallis tests.
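As an illustration of the statistical comparison described above, the following is a minimal Python sketch of a Kruskal-Wallis test across three groups of ratings. It is not the authors' analysis code: the function name `kruskal_wallis` and the example scores are hypothetical placeholders, and the sketch assumes exactly three groups so that the chi-square approximation with 2 degrees of freedom reduces to p = exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis(*groups):
    """Return (H, p) for a Kruskal-Wallis test on exactly three groups.

    Hypothetical helper for illustration only. Uses the tie-corrected H
    statistic and the chi-square approximation with 2 degrees of freedom.
    """
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (df = 2)")

    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    # H = 12 / (N (N + 1)) * sum(R_k^2 / n_k) - 3 (N + 1)
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

    # Tie correction; it is zero only when every observation is identical.
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        return 0.0, 1.0
    h /= correction

    # Chi-square survival function with 2 degrees of freedom.
    p = math.exp(-h / 2)
    return h, p


# Made-up 5-point ratings, NOT data from the study.
standard = [3, 4, 3, 4, 3, 4, 3, 4]
personalized = [4, 4, 3, 5, 4, 4, 3, 4]
simulated_agents = [5, 4, 5, 4, 5, 5, 4, 5]

h_stat, p_value = kruskal_wallis(standard, personalized, simulated_agents)
print(f"H = {h_stat:.2f}, P = {p_value:.3f}")
```

A rank-based test of this kind suits ordinal rating scales because it compares rank distributions rather than assuming normally distributed scores. In practice a statistics library routine such as `scipy.stats.kruskal` would normally be used instead of a hand-written function.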
Results:
Simulated-Agents GPT performed best in format compliance and in most content quality criteria, including accuracy (mean 4.47/5, P=.012) and clarity (mean 4.46/5, P=.004). It also scored 88% for usability without major revisions and ranked first in preference, outperforming the other configurations. Personalized GPT showed the lowest format compliance, while Standard GPT scored lowest for clarity and educational value.
Conclusions:
Structured prompting strategies, particularly agent simulation, enhance the reliability and usability of LLM-generated OSCE content. These findings offer practical guidance for integrating artificial intelligence into medical education, while highlighting the continued need for expert validation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.