
Accepted for/Published in: JMIR Mental Health

Date Submitted: Jul 16, 2025
Open Peer Review Period: Jul 17, 2025 - Sep 11, 2025
Date Accepted: Sep 10, 2025

The final, peer-reviewed published version of this preprint can be found here:

Comparing Generative Artificial Intelligence and Mental Health Professionals for Clinical Decision-Making With Trauma-Exposed Populations: Vignette-Based Experimental Study

JMIR Ment Health 2025;12:e80801. DOI: 10.2196/80801

Warning: This is an author submission that has not been peer reviewed or edited. Preprints, unless marked as "accepted," should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Does Generative Artificial Intelligence Demonstrate Less Bias in Clinical Decision-Making Than Mental Health Professionals?

  • Katherine Wislocki
  • Sabahat Sami
  • Gahl Liberzon
  • Alyson Zalta

ABSTRACT

Background:

Trauma exposure is highly prevalent and associated with a range of health problems. However, healthcare professionals can exhibit trauma-related diagnostic overshadowing bias, leading to misdiagnosis and inadequate treatment of trauma-exposed populations. Generative artificial intelligence (GAI) models are increasingly used in healthcare contexts, yet no research has examined whether GAI demonstrates this bias in clinical decision-making or how rates of this bias compare with those of mental health professionals (MHPs).

Objective:

This study aimed to (1) assess trauma-related diagnostic overshadowing among frontier GAI models and (2) compare evidence of trauma-related diagnostic overshadowing between frontier GAI models and MHPs.

Methods:

Mental health professionals (N=232; mean age 43.7 years) completed an experimental paradigm consisting of two vignettes describing adults presenting with symptoms of obsessive-compulsive disorder (OCD) or substance use disorder (SUD). One vignette included a trauma exposure history (i.e., sexual trauma or physical trauma), and one vignette did not. Participants answered questions about their preferred diagnosis and treatment options for the clients described in the vignettes. GAI models (Gemini 1.5 Flash, ChatGPT 4o mini, Meta Llama 3) completed the same experimental paradigm, with each GAI model completing each vignette block 20 times. Independent-samples t tests and chi-square analyses were used to assess diagnostic and treatment decision-making across vignette factors and respondent types.
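
To make the analytic approach concrete, here is a minimal sketch in Python using scipy.stats. Only the choice of tests (independent-samples t tests for Likert ratings, chi-square analyses for forced-choice selections) comes from the Methods above; the sample values, group sizes, and counts are fabricated placeholders and do not reproduce the study's data or results.

    # Minimal sketch of the analyses named above; not the authors' code.
    # All data below are fabricated placeholders.
    import numpy as np
    from scipy.stats import ttest_ind, chi2_contingency

    rng = np.random.default_rng(seed=0)

    # Hypothetical 1-7 Likert ratings for the target diagnosis when trauma
    # is present: one rating per MHP (N = 232) and one per GAI run
    # (3 models x 20 repetitions of a vignette block = 60 runs).
    mhp_ratings = rng.integers(1, 8, size=232)  # placeholder responses
    gai_ratings = rng.integers(1, 8, size=60)   # placeholder responses

    t_stat, p_value = ttest_ind(mhp_ratings, gai_ratings)
    print(f"Likert ratings: t = {t_stat:.2f}, p = {p_value:.3f}")

    # Hypothetical forced-choice outcomes: counts of correct vs. incorrect
    # target-diagnosis selections for each respondent type.
    counts = np.array([
        [150, 82],  # MHPs: correct, incorrect (placeholder counts)
        [55, 5],    # GAI runs: correct, incorrect (placeholder counts)
    ])
    chi2, p, dof, _ = chi2_contingency(counts)
    print(f"Forced choice: chi2({dof}) = {chi2:.2f}, p = {p:.3f}")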

Results:

Like MHPs, GAI models demonstrated some evidence of trauma-related diagnostic overshadowing bias, particularly in Likert ratings for posttraumatic stress disorder (PTSD) diagnosis and treatment when sexual trauma was present (p < .001). However, GAI models mostly exhibited less bias than MHPs in both Likert and forced-choice clinical decisions. Compared with MHPs, GAI models assigned significantly higher mean target diagnosis and treatment ratings for OCD cases (p < .001) and higher target treatment ratings for SUD cases (p < .001) when trauma was present. In forced-choice selections, GAI models were significantly more accurate than MHPs in OCD cases and in SUD cases with sexual trauma (p < .001).

Conclusions:

GAI models demonstrated evidence of trauma-related diagnostic overshadowing bias, although the degree of bias varied by task and model. Overall, GAI models exhibited less bias than MHPs in this experimental paradigm. These findings highlight the importance of understanding GAI biases in mental healthcare. Further research on bias reduction strategies and on the responsible implementation of GAI models in mental healthcare is needed.


Citation

Please cite as:
Wislocki K, Sami S, Liberzon G, Zalta A
Comparing Generative Artificial Intelligence and Mental Health Professionals for Clinical Decision-Making With Trauma-Exposed Populations: Vignette-Based Experimental Study
JMIR Ment Health 2025;12:e80801
DOI: 10.2196/80801
PMID: 41086458
PMCID: 12527320


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.