JMIR Preprints #93498: Implicit Bias in Large Language Model Diagnosis of Eating Disorders: Experimental Vignette Study

Implicit Bias in Large Language Model Diagnosis of Eating Disorders: Experimental Vignette Study

Deija McCalla;
Bochen Li;
Saul Jaeger;
Justin Jacques;
Leo Gonzalez Jr;
Charles Silber;
Cass Dykeman

ABSTRACT

Background:

Large language models are increasingly deployed in mental health applications, yet growing evidence suggests they encode algorithmic biases that influence clinical outputs. Because these models now mediate patient-facing decisions, such biases carry the potential for direct harm. Whether they systematically affect psychiatric diagnosis across demographic groups remains underexplored.

Objective:

To examine whether large language models (LLMs) exhibit implicit demographic biases when generating psychiatric diagnoses.

Methods:

We developed 1,152 synthetic clinical vignettes using a matched-pair design that manipulated gender, race/ethnicity, age, socioeconomic status, English proficiency, and urbanicity while holding clinical content constant. Vignettes were divided into control (unambiguous anorexia nervosa) and ambiguous conditions designed to permit differential diagnosis. Ten LLM configurations across five model families were tested.
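The matched-pair design described above can be sketched as a full factorial crossing of demographic factors with a fixed clinical text per condition. This is a minimal illustration, not the authors' generation pipeline: the abstract names the six factors but not their levels, so every level below, the placeholder clinical text, and the helper name `build_vignettes` are assumptions, and the resulting count does not reproduce the reported 1,152.

```python
from itertools import product

# Hypothetical factor levels: the abstract lists the six manipulated factors
# but not their levels, so every value here is an illustrative assumption.
FACTORS = {
    "gender": ["female", "male"],
    "race_ethnicity": ["Asian", "Black", "Latine", "White"],
    "age": ["adolescent", "adult"],
    "socioeconomic_status": ["low", "high"],
    "english_proficiency": ["fluent", "limited"],
    "urbanicity": ["urban", "rural"],
}

# Placeholder clinical text, one per condition. It is identical across all
# demographic variants, so only the demographics differ within a condition.
CLINICAL_CONTENT = {
    "control": "<unambiguous anorexia nervosa presentation>",
    "ambiguous": "<presentation permitting differential diagnosis>",
}


def build_vignettes(factors=FACTORS, content=CLINICAL_CONTENT):
    """Cross every demographic combination with every condition."""
    names = list(factors)
    vignettes = []
    for condition, text in content.items():
        for levels in product(*(factors[name] for name in names)):
            demographics = dict(zip(names, levels))
            vignettes.append(
                {"condition": condition, "clinical_text": text, **demographics}
            )
    return vignettes


vignettes = build_vignettes()
print(len(vignettes))  # number of vignettes under the assumed levels
```

Because the clinical text is attached per condition rather than per demographic profile, any two vignettes in the same condition differ only in their demographic fields, which is what allows diagnostic differences to be attributed to demographics.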

Results:

Control vignettes produced near-unanimous anorexia nervosa diagnoses (M = 100.0%), while ambiguous vignettes elicited greater variability (M = 23.6%). Inter-model agreement was moderate for ambiguous vignettes (Fleiss' κ = 0.410, 95% CI: 0.397–0.422). Mixed-effects logistic regression with LLM as a random intercept revealed significant demographic biases. Compared with White patients with identical presentations, Black patients had over six times the odds of receiving a major depressive disorder diagnosis (OR = 6.09, 95% CI: 5.13–7.24) and Latine patients had over nine times the odds (OR = 9.57, 95% CI: 8.00–11.45), while Asian patients had nearly three times the odds of receiving an anorexia nervosa diagnosis (OR = 2.88, 95% CI: 2.44–3.42). Female patients had lower odds than male patients of being diagnosed with anorexia nervosa (OR = 0.43, 95% CI: 0.37–0.49).
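For readers who want the reported effects on the log-odds scale, where logistic regression coefficients are estimated, the following sketch back-calculates each coefficient and an approximate standard error from the odds ratios and intervals above. It assumes the intervals are Wald-type, that is, symmetric on the log scale with z = 1.96; the abstract does not state how they were computed, and the function name `log_odds_and_se` is introduced here for illustration only.

```python
import math


def log_odds_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover a log-odds coefficient and its approximate standard error
    from an odds ratio and an assumed Wald-type 95% confidence interval."""
    beta = math.log(odds_ratio)
    # Under the Wald assumption the interval spans 2 * z standard errors
    # on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Odds ratios and 95% CIs exactly as reported in the abstract.
REPORTED = {
    "Black, major depressive disorder": (6.09, 5.13, 7.24),
    "Latine, major depressive disorder": (9.57, 8.00, 11.45),
    "Asian, anorexia nervosa": (2.88, 2.44, 3.42),
    "Female, anorexia nervosa": (0.43, 0.37, 0.49),
}

for label, (odds_ratio, low, high) in REPORTED.items():
    beta, se = log_odds_and_se(odds_ratio, low, high)
    print(f"{label}: beta = {beta:.3f}, SE = {se:.3f}")
```

An odds ratio below 1, as for female patients, corresponds to a negative coefficient. Odds ratios also overstate relative risk when the outcome is common, so they are best read as ratios of odds rather than of probabilities.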

Conclusions:

These findings demonstrate that LLMs exhibit systematic demographic biases in psychiatric diagnosis even when clinical content is held constant, revealing measurable patterns that can inform improvements to training data, model architecture, and clinical deployment frameworks.

Citation

Please cite as:

McCalla D, Li B, Jaeger S, Jacques J, Gonzalez L Jr, Silber C, Dykeman C

Implicit Bias in Large Language Model Diagnosis of Eating Disorders: Experimental Vignette Study

JMIR Preprints. 13/02/2026:93498

DOI: 10.2196/preprints.93498

URL: https://preprints.jmir.org/preprint/93498

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Currently submitted to: JMIR AI

Date Submitted: Feb 13, 2026

Open Peer Review Period: Feb 23, 2026 - Apr 20, 2026

(currently open for review)
