Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Nov 14, 2024
Date Accepted: Apr 3, 2025
Patient reactions to AI-clinician discrepancies: Web-based randomized experiment
ABSTRACT
Background:
As the number of FDA-approved AI tools for medical imaging rises, radiologists are increasingly integrating AI into their clinical practices. In lung cancer screening, diagnostic AI offers a second set of eyes with the potential to detect cancer earlier than human radiologists can. Despite AI’s promise, a potential problem with its integration is the erosion of patient confidence in clinician expertise when the radiologist’s and the AI’s interpretations of the imaging test disagree.
Objective:
This study aims to examine how discrepancies between AI-derived recommendations and radiologists’ recommendations affect patients’ agreement with radiologists’ recommendations and satisfaction with their radiologists. We also analyze how patients’ medical maximizing-minimizing (MMM) preferences moderate these relationships.
Methods:
We conducted a randomized, between-subjects experiment with 1,606 U.S. adult participants. Assuming the role of patients, participants imagined undergoing a low-dose computed tomography scan for lung cancer screening and receiving results and recommendations from: (1) a radiologist only, (2) AI and a radiologist in agreement, (3) a radiologist who recommended more testing than AI (i.e., radiologist overcalled AI), or (4) a radiologist who recommended less testing than AI (i.e., radiologist undercalled AI). Participants rated the radiologist on three criteria: agreement with the radiologist’s recommendation, how likely they would be to recommend the radiologist to family and friends, and how good of a provider they perceived the radiologist to be. We also measured MMM preferences and categorized participants as maximizers (i.e., those who seek aggressive intervention), minimizers (i.e., those who prefer no or passive intervention), and neutrals (i.e., those in the middle).
Results:
Participants’ agreement with the radiologist’s recommendation was significantly lower when the radiologist undercalled AI (M=4.01, SE=0.07) vs. overcalled (M=4.63, SE=0.06), agreed with (M=4.55, SE=0.07), or had no AI (M=4.57, SE=0.06; P<.001). Additionally, participants were least likely to recommend (P<.001) and positively rate (P<.001) the radiologist who undercalled AI. There were no significant differences among the other conditions. Maximizers strongly agreed (b=0.82, SE=0.14; P<.001) with the radiologist who overcalled AI, whereas minimizers strongly disagreed (b=-0.43, SE=0.18; P=.015), compared to neutrals. Moreover, maximizers strongly disagreed with the radiologist who undercalled AI (b=-0.47, SE=0.14; P=.001).
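The four-condition comparison reported above can be sketched in code. The snippet below is purely illustrative: it simulates ratings whose population means match the reported condition means (the sample sizes, standard deviations, and variable names are assumptions, not the study’s actual data or analysis code), then computes per-condition means, standard errors, and a from-scratch one-way ANOVA F statistic of the kind that underlies the P<.001 omnibus comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four between-subjects conditions from the study design (labels assumed).
conditions = ["radiologist_only", "ai_agrees", "overcall", "undercall"]

# Simulated 7-point agreement ratings; means mirror the reported results,
# but n per condition and SD=1.2 are illustrative assumptions.
n_per = 400
data = {
    "radiologist_only": rng.normal(4.57, 1.2, n_per),
    "ai_agrees": rng.normal(4.55, 1.2, n_per),
    "overcall": rng.normal(4.63, 1.2, n_per),
    "undercall": rng.normal(4.01, 1.2, n_per),
}

# Per-condition mean and standard error of the mean.
for cond in conditions:
    x = data[cond]
    mean, se = x.mean(), x.std(ddof=1) / np.sqrt(len(x))
    print(f"{cond}: M={mean:.2f}, SE={se:.2f}")

# One-way ANOVA F statistic from between- and within-group sums of squares.
all_x = np.concatenate([data[c] for c in conditions])
grand = all_x.mean()
ss_between = sum(len(data[c]) * (data[c].mean() - grand) ** 2 for c in conditions)
ss_within = sum(((data[c] - data[c].mean()) ** 2).sum() for c in conditions)
df_b, df_w = len(conditions) - 1, len(all_x) - len(conditions)
F = (ss_between / df_b) / (ss_within / df_w)
print(f"F({df_b}, {df_w}) = {F:.1f}")
```

Because the undercall condition sits roughly half a scale point below the other three, the simulated F statistic is large, matching the pattern of the reported omnibus test; the moderation results (maximizer/minimizer coefficients) would come from a separate regression with condition-by-MMM interaction terms.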
Conclusions:
Radiologists who recommend less testing than AI may face decreased patient confidence in their expertise, but they may not face this same penalty for giving more aggressive recommendations than AI. Patients’ reactions may depend in part on whether their general preferences to maximize or minimize align with the radiologists’ recommendations.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.