Accepted for/Published in: JMIR Medical Education

Date Submitted: Dec 26, 2025
Date Accepted: Feb 25, 2026

The final, peer-reviewed published version of this preprint can be found here:

AI-generated Feedback Following Social Robotic Virtual Patient Interactions and Medical Student Performance: Nonrandomized Quasi-Experimental Study

Borg A, Schiött J, Ivegren W, Gentline C, Huss V, Hugelius AM, Jobs B, Ruiz M, Edelbring S, Georg C, Skantze G, Parodis I

JMIR Med Educ 2026;12:e90368

DOI: 10.2196/90368

PMID: 41881044

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

AI-generated feedback in social robotic virtual patients and medical student performance: A nonrandomized clinical trial

  • Alexander Borg
  • Jonathan Schiött
  • William Ivegren
  • Cidem Gentline
  • Viking Huss
  • Anna Margareta Hugelius
  • Benjamin Jobs
  • Mini Ruiz
  • Samuel Edelbring
  • Carina Georg
  • Gabriel Skantze
  • Ioannis Parodis

ABSTRACT

Background:

Virtual patients (VPs) demonstrate effectiveness in improving clinical reasoning (CR) skills, yet traditional VP platforms often lack individualised feedback mechanisms. Effective feedback represents a cornerstone of medical education, particularly in developing CR skills. Advances in large language models (LLMs) enable automated analysis of student-VP interactions, providing scalable feedback on clinical performance. While AI-enhanced social robotic VPs show promise for CR training, no studies have examined whether integrated AI-generated feedback improves clinical performance objectively. Determining whether AI-generated feedback translates into improved performance in clinical examinations could provide essential evidence for the educational value of these emerging technologies.

Objective:

To evaluate whether the integration of AI-generated post-consultation feedback into social robotic VP interactions improves medical students’ clinical performance.

Methods:

A quasi-experimental study with 115 sixth-semester medical students (73.2% of eligible students) was conducted at Karolinska Institutet, Stockholm, Sweden, during spring 2025. Students were allocated to either receive (n=61) or not receive (n=54) AI-generated feedback following interactions with a Social AI-enhanced Robotic Interface (SARI). All students completed nine VP cases; students in the intervention group received approximately one page of structured written feedback after each VP case using SARI. The AI feedback system employed multiple LLMs and followed a two-stage algorithm: first assessing student-VP dialogues using an assessment rubric, then generating structured feedback on medical history-taking performance. Students in both groups participated in case-specific follow-up seminars led by consultant rheumatologists following each VP encounter. Clinical performance was assessed through an eight-minute OSCE-like evaluation with a standardised patient portraying axial spondylarthritis. A consultant rheumatologist blinded to group allocation scored each evaluation using a 10-point rubric across five domains: communication at consultation start, generic medical history, targeted medical history, diagnostics and management reasoning, and communication at consultation end.

Results:

Students receiving AI-generated feedback achieved significantly higher total OSCE scores (7.39±0.86 versus 6.68±1.04 points; mean difference: 0.70; 95% CI: 0.35–1.06; P<.001; Cohen's d=0.74). Domain-specific analysis revealed significant improvement in generic medical history after Bonferroni correction (2.46±0.65 versus 2.03±0.79 points; P=.004; r=0.27), while other domains showed no significant differences: communication at start (P=.134; r=0.14), targeted medical history (P=.605; r=0.05), diagnostics and management (P=.149; r=0.14), and communication at consultation end (P=.312; r=0.09). Pass rates were significantly higher in the feedback group (96.7% versus 79.6%; OR: 7.55; 95% CI: 1.51–72.2; P=.006), with a number needed to treat of six students.
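As a rough check on how the reported effect sizes relate to the summary statistics, the sketch below recomputes Cohen's d, the odds ratio, and the number needed to treat. It uses only the rounded means, standard deviations, and group sizes given above. The pass counts (59 of 61 and 43 of 54) are not stated in the abstract; they are an assumption inferred from the reported pass rates of 96.7% and 79.6%. The function names are illustrative, and the pooled-standard-deviation form of Cohen's d is assumed rather than confirmed by the source.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def odds_ratio(pass1, n1, pass2, n2):
    """Sample odds ratio of passing in group 1 relative to group 2."""
    return (pass1 / (n1 - pass1)) / (pass2 / (n2 - pass2))


def number_needed_to_treat(rate1, rate2):
    """Reciprocal of the absolute difference in pass rates, rounded up."""
    return math.ceil(1 / (rate1 - rate2))


# Summary statistics from the abstract (feedback group first).
d = cohens_d(7.39, 0.86, 61, 6.68, 1.04, 54)

# Pass counts are ASSUMED: inferred from the reported 96.7% and 79.6%.
or_ = odds_ratio(59, 61, 43, 54)
nnt = number_needed_to_treat(59 / 61, 43 / 54)

print(f"Cohen's d: {d:.2f}")
print(f"Odds ratio: {or_:.2f}")
print(f"NNT: {nnt}")
```

Because the inputs are rounded, the recomputed d can differ from the reported 0.74 in the second decimal place. The odds ratio and number needed to treat agree with the reported 7.55 and six under the assumed counts.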

Conclusions:

AI-generated feedback following social robotic VP interactions significantly improved medical students' clinical performance in a standardised examination, particularly in generic medical history-taking. These findings support integrating validated AI feedback systems in VP platforms for clinical skill training and demonstrate the feasibility of scalable, automated feedback for medical education. The domain-specific improvement in generic medical history components highlights the importance of targeted, competency-specific feedback design in VP education.

Clinical Trial: NCT07277829




© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.