Accepted for/Published in: JMIR Formative Research
Date Submitted: Jul 16, 2025
Open Peer Review Period: Jul 17, 2025 - Sep 11, 2025
Date Accepted: Oct 15, 2025
Can GPT Generate Medical Dialogue for Clinical Vignettes: An Evaluation
ABSTRACT
Background:
Clinical vignettes often focus on prototypical presentations; require substantial time and effort to develop; and fail to represent patient diversity, the complexity of clinical conditions, patients’ perspectives, and the dynamic nature of physician–patient interactions.
Objective:
We evaluated the quality of physician–patient dialogues produced by generative AI in Japanese, focusing on their medical accuracy and overall appropriateness as medical interviews.
Methods:
We created an AI prompt that included a specific clinical history and instructed the model to simulate a cooperative patient responding to the physician’s questions to generate a physician–patient dialogue. The target diseases were those covered by the Japanese National Medical Licensing Examination. Each dialogue consisted of 25 turns by the physician and 25 by the patient, reflecting the typical volume of conversation in Japanese outpatient settings. Three internists independently evaluated each generated dialogue using a 7-point Likert scale across six criteria: coherence of the conversation, medical accuracy of the patient’s responses, medical accuracy of the physician’s responses, content of the medical history, communication skills, and professionalism. In addition, the composite score for each dialogue was calculated as the overall mean of these six criteria.
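The composite score described above can be illustrated with a minimal sketch. The criterion keys, the `composite_score` function name, and the choice to average across raters first and then across criteria are assumptions made for illustration; the abstract states only that the composite is the overall mean of the six criteria. With complete ratings from every rater, this order of averaging gives the same value as a single grand mean.

```python
from statistics import mean

# Hypothetical short keys for the six criteria named in the Methods.
CRITERIA = (
    "coherence",
    "patient_accuracy",
    "physician_accuracy",
    "history_content",
    "communication",
    "professionalism",
)


def composite_score(ratings):
    """Return the composite score for one dialogue.

    ratings: list of dicts, one per rater, mapping each criterion
    to a score on the 7-point Likert scale (1 to 7).
    """
    for rater in ratings:
        for criterion in CRITERIA:
            if not 1 <= rater[criterion] <= 7:
                raise ValueError(f"{criterion} score out of range: {rater[criterion]}")
    # Assumed order: mean across raters per criterion, then mean across criteria.
    per_criterion = [mean(r[c] for r in ratings) for c in CRITERIA]
    return mean(per_criterion)


# Made-up ratings from three raters for a single dialogue (not study data).
example = [
    {c: 7 for c in CRITERIA},
    {c: 5 for c in CRITERIA},
    {c: 6 for c in CRITERIA},
]
print(composite_score(example))
```

The range check simply rejects scores outside the 7-point scale; how the study handled missing or invalid ratings is not stated in the abstract.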
Results:
The mean scores (standard deviation) for the six criteria were as follows: coherence of the conversation: 5.9 (0.9); medical accuracy of the patient’s responses: 6.0 (0.9); medical accuracy of the physician’s responses: 5.6 (1.1); content of the medical history: 5.9 (0.9); communication skills: 5.6 (0.9); and professionalism: 5.5 (1.1). The composite score was 5.7 (1.0).
Conclusions:
While physician oversight remains essential, generative AI can efficiently produce materials for medical education that overcome the limitations of traditional clinical vignettes. This approach may reduce time and financial burdens and expand opportunities to practice clinical interviewing in settings that closely mirror real-world encounters.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.