JMIR Preprints #85770: Patient Cognitive Bias in Large Language Model–Supported Health Consultations: A Simulation-Based Comparative Study

Current Preprint Settings

(as selected by the authors)

1. When the manuscript is submitted, allow peer review from:

(a) Anybody (open community peer review)
(b) Editor-selected reviewers (closed peer review)

2. When the manuscript is submitted, display the preprint PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

3. When the manuscript is accepted, display the accepted manuscript PDF to:

(a) Anybody, anytime
(b) Logged-in users only
(c) Anybody, anytime (title and abstract only)
(d) No one

Patient Cognitive Bias in Large Language Model–Supported Health Consultations: A Simulation-Based Comparative Study

Yi Zuo;
Qifeng Wan;
Shalong Wang

ABSTRACT

Background:

Large language models (LLMs) are increasingly used by patients for health information and preliminary medical advice. In patient-facing consultations, users may present explicitly stated diagnostic preferences or symptom narratives emphasizing a preferred explanation. Such cognitively biased input constrains the diagnostic context available to the model and may systematically steer its reasoning during interactive LLM-supported health consultations.

Objective:

To quantify the impact of patient cognitive bias on LLM diagnostic performance in multi-turn consultations, to assess the effectiveness of prompt-based mitigation strategies and decoding temperature adjustment, and to evaluate a dual-system framework for improving robustness under biased interaction.

Methods:

We developed a simulated patient agent to generate both unbiased and cognitively biased consultations using 1,273 MedQA-USMLE cases. Six widely used LLMs of varying capacity were evaluated through three-round, multi-turn dialogues, after which each model produced a final diagnostic judgment based on the complete consultation record. Diagnostic accuracy was the primary outcome. Secondary outcomes included bias-induced accuracy decline (BIAD; absolute reduction in accuracy under biased versus standard consultations) and bias-influenced error proportion (BIEP; proportion of incorrect responses aligned with the patient’s preferred but incorrect diagnosis). Four prompt-based mitigation strategies and four decoding temperature settings were tested. In addition, a dual-system framework was evaluated, in which a conversational foundation LLM conducted patient interaction and history taking (System 1), while a reasoning-oriented LLM (o1-Mini) generated the final diagnostic judgment (System 2). In the foundation-only condition, the same LLM performed both interaction and diagnosis.

Results:

Across all six evaluated models, cognitively biased consultations led to marked diagnostic accuracy declines of approximately 8 to 39 percentage points compared with standard multi-turn consultations (P < .001), whereas static single-response tests and standard consultations showed comparable accuracy. Larger deteriorations were observed in lower-capacity models, with some approaching random-guess performance under bias. Errors were frequently aligned with patient bias, with BIEP exceeding one-third across models, indicating systematic conformity rather than random error. Prompt-based mitigation strategies and decoding temperature reduction yielded limited and inconsistent improvements and did not reliably prevent bias-induced performance loss. By contrast, the dual-system framework substantially improved diagnostic accuracy under biased conditions in most models, producing gains of approximately 10 to 39 percentage points and recovering a large proportion of the performance lost due to bias (P < .001), particularly in lower-capacity systems.

Conclusions:

Patient-driven cognitive bias represents an underrecognized behavioral risk in LLM-supported health consultations. Common mitigation approaches such as prompt engineering or decoding parameter adjustment provide limited resilience. Explicitly separating conversational interaction from deliberative diagnostic reasoning through a dual-system architecture enables more robust diagnostic performance under biased input while preserving fluent patient-facing dialogue, offering a scalable design strategy for safer medical AI systems.

Citation

Please cite as:

Zuo Y, Wan Q, Wang S

Patient Cognitive Bias in Large Language Model–Supported Health Consultations: Simulation-Based Comparative Study

J Med Internet Res 2026;28:e85770

DOI: 10.2196/85770

PMID: 42275635

PMCID: 13258194

Download PDF

Request queued. Please wait while the file is being generated. It may take some time.

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 13, 2025

Open Peer Review Period: Oct 13, 2025 - Dec 8, 2025

Date Accepted: May 4, 2026

(closed for review but you can still tweet)

Patient Cognitive Bias in Large Language Model–Supported Health Consultations: A Simulation-Based Comparative Study

ABSTRACT

Citation

Copyright