Currently submitted to: JMIR Formative Research
Date Submitted: Jun 13, 2026
Open Peer Review Period: Jun 15, 2026 - Aug 10, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
A Real-Time Artificial Intelligence Diagnostic Copilot in Simulated Primary Care Consultations: Randomized Simulation Study
ABSTRACT
Background:
Diagnostic errors in primary care contribute substantially to avoidable morbidity. Real-time artificial intelligence (AI) diagnostic copilots may help physicians broaden diagnostic reasoning during clinical encounters, but early evidence is needed from interactive settings that approximate real consultations while allowing safe assessment before real-world clinical deployment.
Objective:
This study aimed to conduct a formative, simulation-based evaluation of whether a real-time, voice-based AI diagnostic copilot was associated with improved physician diagnostic accuracy and measurable automation bias during simulated primary care consultations.
Methods:
We conducted a case-level randomized, adjudicator-blinded simulation study in a web-based virtual primary care clinic. Thirteen primary care physicians managed 260 simulated voice-based consultations. Cases were randomized to either real-time AI assistance with Medsys AI or a usual-care simulation condition without AI assistance. No real patients were enrolled, no clinical care was delivered or modified, and all outcomes were physician-level performance, time, usability, or automation-bias measures. The primary outcome was diagnostic accuracy, assessed by three blinded adjudicators and analyzed using mixed-effects logistic regression with random intercepts for physician and case.
Results:
AI assistance increased adjusted diagnostic accuracy from 67.3% (95% CI 58.0% - 75.4%) in the usual-care simulation condition to 82.6% (95% CI 72.9% - 90.3%) in the AI-assisted condition (adjusted odds ratio 2.64, 95% CI 1.61 - 4.33; P<.001). The absolute benefit was greatest in the most complex cases (+27.6 percentage points), while relative error-rate reduction was consistent across difficulty quartiles. This corresponded to a simulation-context NNT-equivalent of 6.5 simulated consultations per additional correct physician diagnosis. The AI-assisted condition was also associated with a measurable automation-bias signal, defined here as adoption of incorrect AI suggestions: physicians were more likely to adopt incorrect AI suggestions when the AI’s primary suggestion was wrong (adjusted odds ratio 1.94, 95% CI 1.11 - 3.41; P=.02). Consultation time increased modestly in the AI-assisted condition, and physicians rated the tool highly for usefulness and satisfaction.
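As a rough check of the arithmetic behind the NNT-equivalent, the sketch below takes the reciprocal of the absolute difference between the two adjusted accuracy estimates reported above. It assumes the figure was derived from those adjusted estimates, which the abstract does not state explicitly; the function name `nnt_equivalent` is hypothetical and is not taken from the study.

```python
def nnt_equivalent(p_control: float, p_assisted: float) -> float:
    """Reciprocal of the absolute difference in diagnostic accuracy.

    Both arguments are proportions between 0 and 1. This is an
    illustrative helper, not the authors' analysis code.
    """
    difference = p_assisted - p_control
    if difference <= 0:
        raise ValueError("assisted accuracy must exceed control accuracy")
    return 1.0 / difference


# Adjusted accuracies as reported in the Results paragraph.
usual_care = 0.673
ai_assisted = 0.826

# Absolute difference is 15.3 percentage points.
print(round(nnt_equivalent(usual_care, ai_assisted), 1))  # 6.5
```

Under this assumption, a 15.3 percentage point difference gives roughly 6.5 simulated consultations per additional correct diagnosis, matching the reported value.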
Conclusions:
In this randomized simulation study, a real-time AI diagnostic copilot was associated with higher physician diagnostic accuracy during simulated primary care consultations, particularly in complex cases. However, this formative evaluation also identified a measurable automation-bias signal when AI suggestions were incorrect. These findings support further refinement of the human-AI interface and prospective evaluation in real-world clinical settings before clinical implementation.
Clinical Trial:
Study registration was not applicable because this was a formative, randomized simulation-based physician-performance study using virtual patient cases: no real patients were enrolled, no clinical care was delivered or modified, no identifiable patient health data were used, and no patient health outcomes or patient adverse events were measured. All outcomes were physician-level performance, time, usability, or automation-bias measures.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.