Currently submitted to: JMIR Formative Research

Date Submitted: Jun 13, 2026
Open Peer Review Period: Jun 15, 2026 - Aug 10, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

A Real-Time Artificial Intelligence Diagnostic Copilot in Simulated Primary Care Consultations: Randomized Simulation Study

  • Ivan Cusacovich Sr; 
  • Manuel Pinilla Martin; 
  • Manuel Pinilla Martin

ABSTRACT

Background:

Diagnostic errors in primary care contribute substantially to avoidable morbidity. Real-time artificial intelligence (AI) diagnostic copilots may help physicians broaden diagnostic reasoning during clinical encounters, but early evidence is needed from interactive settings that approximate real consultations while allowing safe assessment before real-world clinical deployment.

Objective:

This study aimed to conduct a formative, simulation-based evaluation of whether a real-time, voice-based AI diagnostic copilot was associated with improved physician diagnostic accuracy and measurable automation bias during simulated primary care consultations.

Methods:

We conducted a case-level randomized, adjudicator-blinded simulation study in a web-based virtual primary care clinic. Thirteen primary care physicians managed 260 simulated voice-based consultations. Cases were randomized to either real-time AI assistance with Medsys AI or a usual-care simulation condition without AI assistance. No real patients were enrolled, no clinical care was delivered or modified, and all outcomes were physician-level performance, time, usability, or automation-bias measures. The primary outcome was diagnostic accuracy, assessed by three blinded adjudicators and analyzed using mixed-effects logistic regression with random intercepts for physician and case.

Results:

AI assistance increased adjusted diagnostic accuracy from 67.3% (95% CI 58.0% - 75.4%) in the usual-care simulation condition to 82.6% (95% CI 72.9% - 90.3%) in the AI-assisted condition (adjusted odds ratio 2.64, 95% CI 1.61 - 4.33; P<.001). The absolute benefit was greatest in the most complex cases (+27.6 percentage points), while relative error-rate reduction was consistent across difficulty quartiles. This corresponded to a simulation-context NNT-equivalent of 6.5 simulated consultations per additional correct physician diagnosis. The AI-assisted condition was also associated with a measurable automation-bias signal, defined here as adoption of incorrect AI suggestions: physicians were more likely to adopt incorrect AI suggestions when the AI’s primary suggestion was wrong (adjusted odds ratio 1.94, 95% CI 1.11 - 3.41; P=.02). Consultation time increased modestly in the AI-assisted condition, and physicians rated the tool highly for usefulness and satisfaction.
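The NNT-equivalent reported above can be checked with simple arithmetic. The abstract does not state how the figure was derived; the sketch below assumes it is the reciprocal of the absolute difference between the two adjusted accuracy estimates, and the function name `nnt_equivalent` is hypothetical, introduced only for illustration.

```python
def nnt_equivalent(control_accuracy: float, assisted_accuracy: float) -> float:
    """Reciprocal of the absolute accuracy difference (inputs are proportions).

    Assumption: the abstract's NNT-equivalent is derived this way; the
    source does not describe the calculation.
    """
    difference = assisted_accuracy - control_accuracy
    if difference <= 0:
        raise ValueError("assisted accuracy must exceed control accuracy")
    return 1.0 / difference


# Adjusted accuracy estimates quoted in the Results paragraph.
usual_care = 0.673
ai_assisted = 0.826

absolute_gain_pp = (ai_assisted - usual_care) * 100  # percentage points
nnt = nnt_equivalent(usual_care, ai_assisted)

print(f"Absolute gain: {absolute_gain_pp:.1f} percentage points")
print(f"NNT-equivalent: {nnt:.1f} simulated consultations")
```

Under this assumption, the 15.3 percentage point difference gives a value that rounds to the 6.5 simulated consultations stated in the abstract.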

Conclusions:

In this randomized simulation study, a real-time AI diagnostic copilot was associated with higher physician diagnostic accuracy during simulated primary care consultations, particularly in complex cases. However, this formative evaluation also identified a measurable automation-bias signal when AI suggestions were incorrect. These findings support further refinement of the human-AI interface and prospective evaluation in real-world clinical settings before clinical implementation.

Clinical Trial:

Study registration was not applicable because this was a formative, randomized simulation-based physician-performance study using virtual patient cases: no real patients were enrolled, no clinical care was delivered or modified, no identifiable patient health data were used, and no patient health outcomes or patient adverse events were measured. All outcomes were physician-level performance, time, usability, or automation-bias measures.


 Citation

Please cite as:

Cusacovich I Sr, Pinilla Martin M, Pinilla Martin M

A Real-Time Artificial Intelligence Diagnostic Copilot in Simulated Primary Care Consultations: Randomized Simulation Study

JMIR Preprints. 13/06/2026:104579

DOI: 10.2196/preprints.104579

URL: https://preprints.jmir.org/preprint/104579

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.