
Currently submitted to: Journal of Medical Internet Research

Date Submitted: Jan 15, 2026

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Are GP Residents Taking Prior Information on AI Accuracy into Account when Receiving AI Assistance? A Randomized Controlled Experiment

  • Christof Schulz; 
  • Laura Zwaan; 
  • Marlies Wakkee; 
  • Justine Staal

ABSTRACT

Background:

Evidence has shown that artificial intelligence can aid clinicians in the diagnosis of skin disease, but factors influencing human-AI collaboration remain underexplored. Successful human-AI collaboration relies on the alignment between a person's reliance on the tool and its actual capabilities (i.e., trust calibration). However, little is known about which factors shape this interaction. We hypothesized that reliance on AI may be predicted by human performance expectancy and investigated whether providing prior information about the statistical performance of AI tools influences reliance on different AI tools.

Objective:

The influence of prior information about AI performance on AI reliance among GP residents was examined experimentally. Specifically, we assessed whether framing an AI tool as high- versus low-performing affected AI reliance, diagnostic accuracy, and confidence when diagnosing skin lesions.

Methods:

General practice (GP) residents diagnosed 50 clinical images of skin lesions as malignant or benign. Participants first diagnosed the lesions without AI-tool support. They then received a diagnostic suggestion, framed as coming from an AI-tool, and were asked whether they wanted to change their original diagnosis. The images were presented in two (counterbalanced) blocks: In one block, the residents were informed that the AI-tool had a lower accuracy and in the other block, the AI-tool supposedly had a higher accuracy. In reality, the suggestions were mock-ups and equally accurate in both blocks.

Results:

Eighty-three GP residents (Mdn=39 years old, 60 female, 1 non-binary) completed the study. Displaying prior information about AI performance did not affect how much participants relied on AI (p=0.319). Overall classification accuracy improved after receiving assistance (OR = 1.34, SE = 0.08, 95% CI [1.20, 1.50], p<0.001), slightly exceeding the accuracy of humans or AI alone. However, accuracy did not differ between the low and high AI-tool performance instructions (p=0.923). Interestingly, the GP residents were approximately twice as likely to switch to the AI's recommendation when the advice was correct as when it was incorrect (OR = 1.99, SE = 0.49, 95% CI [1.23, 3.22], p=0.005), indicating that participants displayed some ability to distinguish between beneficial and incorrect advice. Confidence in the diagnosis, rated on a 1-10 Likert scale, increased from baseline (M = 6.81, SE = 0.14) under high-performance instructions (M = 7.05, SE = 0.04, p < 0.001), but no significant increase in confidence was measured under low-performance instructions (M = 6.89, SE = 0.04, p = 0.097).

Conclusions:

Providing GP residents with prior information about AI performance affected neither their willingness to change their diagnosis nor their diagnostic accuracy. However, high-performance instructions increased confidence more than low-performance instructions. Providing AI assistance, regardless of the performance instructions, improved the residents' performance in diagnosing skin lesions. The participants also displayed an ability to discriminate between correct and incorrect advice. Clinical Trial: This study was pre-registered at the Open Science Framework: https://doi.org/10.17605/OSF.IO/ES7K4


 Citation

Please cite as:

Schulz C, Zwaan L, Wakkee M, Staal J

Are GP Residents Taking Prior Information on AI Accuracy into Account when Receiving AI Assistance? A Randomized Controlled Experiment

JMIR Preprints. 15/01/2026:91512

DOI: 10.2196/preprints.91512

URL: https://preprints.jmir.org/preprint/91512


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.