Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 12, 2024
Date Accepted: Sep 12, 2024
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias
ABSTRACT
Background:
Cognitive biases in clinical decision-making significantly contribute to errors in diagnosis and suboptimal patient outcomes. Addressing these biases presents a formidable challenge in the medical field.
Objective:
This study explores the role of large language models (LLMs) in mitigating these biases through the use of a multi-agent framework. We simulated clinical decision-making processes through multi-agent conversations and evaluated their efficacy in improving diagnostic accuracy compared with humans.
Methods:
A total of 16 published and unpublished case reports in which cognitive biases resulted in misdiagnosis were identified from the literature. In the multi-agent framework, we leveraged GPT-4 to facilitate interactions among different simulated agents to replicate clinical team dynamics. Each agent was assigned a distinct role: 1) making the final diagnosis after considering the discussions, 2) acting as a devil’s advocate to correct confirmation and anchoring biases, 3) serving as a field expert in the required medical subspecialty, 4) facilitating discussions to mitigate premature closure bias, and 5) recording and summarizing findings. We tested varying combinations of these agents within the framework to determine which configuration yielded the highest rate of correct final diagnoses. Each scenario was repeated 5 times for consistency. The accuracies of the initial diagnoses and the final differential diagnoses were evaluated, and comparisons with human-generated answers were made using Fisher’s exact test.
Results:
A total of 240 responses were evaluated across 3 different multi-agent frameworks. The initial diagnosis had an accuracy of 0% (0/80). However, following multi-agent discussions, the accuracy for the top two differential diagnoses increased to 76.3% for the best-performing multi-agent framework (Framework 4-C), significantly higher than the accuracy achieved by human evaluators (OR=3.49, p=0.002).
Conclusions:
The multi-agent framework demonstrated an ability to re-evaluate and correct misconceptions, even in scenarios with misleading initial investigations. Additionally, the LLM-driven multi-agent conversation framework shows promise in enhancing diagnostic accuracy in diagnostically challenging medical scenarios.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.