Accepted for/Published in: JMIR Medical Education
Date Submitted: May 30, 2025
Open Peer Review Period: May 30, 2025 - Jun 24, 2025
Date Accepted: Aug 14, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Development of a Clinical Clerkship Mentor Using Generative Artificial Intelligence and Evaluation of its Effectiveness in a Medical Student Trial Compared to Student Mentors
ABSTRACT
Background:
Medical students face multiple challenges related to acquiring clinical and communication skills, building professional relationships, and managing psychological stress at the beginning of their clinical clerkships (CCs). While mentoring and structured feedback are known to provide critical support, existing systems may not offer sufficient and timely guidance, owing to faculty’s limited availability. Generative artificial intelligence (gAI), particularly large language models, offers new opportunities to support medical education by providing context-sensitive responses.
Objective:
This study aimed to develop a gAI CC mentor (AI-CCM) based on ChatGPT and to evaluate its effectiveness in supporting medical students’ clinical learning, addressing their concerns, and supplementing human mentoring. The secondary objective was to compare the educational value of AI-CCM’s responses with that of responses from senior student mentors.
Methods:
We conducted 2 studies. In Study 1, we created 5 scenarios based on challenges that students commonly encounter during CCs. For each scenario, 5 senior student mentors and AI-CCM generated written advice. Five medical education experts evaluated these responses using a rubric to assess accuracy, practical utility, and educational appropriateness (5-point Likert scale) and safety (binary scale). In Study 2, 17 fourth-year medical students used AI-CCM for 1 week during their CCs and completed a questionnaire, informed by the Technology Acceptance Model, that evaluated its usefulness, clarity, emotional support, and impact on communication and learning (5-point Likert scale).
Results:
In Study 1, AI-CCM’s mean scores were numerically higher than those of senior student mentors on all three rated dimensions, but only one difference was statistically significant. AI-CCM responses were rated higher in educational appropriateness (4.2 ± 0.7 vs 3.8 ± 1.0, p = .001); no significant differences were observed in accuracy (4.4 ± 0.7 vs 4.2 ± 0.9, p = .111) or practical utility (4.1 ± 0.7 vs 4.0 ± 0.9, p = .347). No safety concerns were identified in AI-CCM responses, whereas two concerns were noted in student mentors’ responses. Scenario-specific analysis revealed that AI-CCM performed significantly better in emotional and psychological stress scenarios. In the student trial (Study 2), AI-CCM was rated as moderately useful (mean usefulness 3.9 ± 1.1), with positive evaluations for clarity (4.0 ± 0.9) and emotional support (3.8 ± 1.1). However, feedback guidance (2.9 ± 0.9) and anxiety reduction (3.2 ± 1.0) received more neutral ratings. Students primarily consulted AI-CCM about learning workload and communication difficulties; few used it to address emotional stress-related issues.
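To put the reported group differences on a common scale, a standardized mean difference can be computed from the means and standard deviations given above. The following is a minimal sketch, not the authors' analysis: the abstract does not state which statistical test was used or how many ratings fell into each group, so the function `standardized_mean_difference` (a hypothetical helper) assumes equal weighting of the two groups when pooling the standard deviations.

```python
import math


def standardized_mean_difference(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d-style effect size from summary statistics.

    Assumption: the two groups are weighted equally when pooling the
    standard deviations, because group sizes are not given in the abstract.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Means and SDs as reported in the Results:
# (AI-CCM mean, AI-CCM SD, student mentor mean, student mentor SD)
reported = {
    "educational appropriateness": (4.2, 0.7, 3.8, 1.0),
    "accuracy": (4.4, 0.7, 4.2, 0.9),
    "practical utility": (4.1, 0.7, 4.0, 0.9),
}

effect_sizes = {
    name: standardized_mean_difference(*values)
    for name, values in reported.items()
}

for name, d in effect_sizes.items():
    print(f"{name}: d = {d:.2f}")
```

Under this equal-weighting assumption, the educational appropriateness difference is the largest of the three, which is consistent with it being the only comparison reported as statistically significant.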
Conclusions:
AI-CCM has potential as a supplementary educational partner during CCs and offers support comparable to that of senior student mentors in structured scenarios. Despite challenges of response latency and limited depth in clinical content, AI-CCM was well received by and accessible to students who used ChatGPT’s free version. With further refinements, including specialty-specific content and improved responsiveness, AI-CCM may serve as a scalable, context-sensitive support system in clinical medical education.
Clinical Trial:
None
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.