Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: May 27, 2025
Date Accepted: Nov 10, 2025
Detecting Sociodemographic Biases in the Content and Quality of LLM-Generated Nursing Care: A Cross-Sectional Simulation Study
ABSTRACT
Background:
Large language models (LLMs) are increasingly applied in healthcare. However, concerns remain that their nursing care recommendations may reflect patients’ sociodemographic attributes rather than clinical needs.
Objective:
We investigated potential biases in LLM-generated nursing care plans, examining whether outputs differ systematically by patients’ sociodemographic characteristics, and assessed the implications for equitable nursing care.
Methods:
We used a standardized clinical scenario with GPT to generate care plans for 96 sociodemographic identity combinations, for a total of 9,600 tests. We then used t-tests and ANOVA to analyze how text length and the frequency of physiological and psychological nursing terms varied across sociodemographic factors. Data processing and visualization were performed in Python.
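As an illustration of the design described above, the following minimal sketch enumerates a full factorial grid of patient profiles and computes the three outcome measures for one care plan. The factor levels, the 100 runs per profile, the word-based length measure, and both term lexicons are hypothetical: the abstract reports only the totals (96 combinations, 9,600 tests) and does not specify any of these details.

```python
import re
from itertools import product

# Hypothetical factor levels (2 x 4 x 2 x 2 x 3 = 96); the abstract gives
# only the total number of combinations, not the levels themselves.
FACTORS = {
    "sex": ["female", "male"],
    "age_group": ["young", "middle-aged", "older", "oldest"],
    "education": ["lower", "higher"],
    "residence": ["rural", "urban"],
    "income": ["low", "middle", "high"],
}
RUNS_PER_PROFILE = 100  # assumed: 9,600 tests / 96 combinations

# Hypothetical lexicons; the study's actual term lists are not given here.
PHYSIOLOGICAL_TERMS = {"vital signs", "medication", "wound", "pain"}
PSYCHOLOGICAL_TERMS = {"anxiety", "emotional support", "counseling"}


def build_profiles(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def count_terms(text, terms):
    """Count whole-phrase, case-insensitive occurrences of each term."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
        for term in terms
    )


def measure_plan(text):
    """Compute text length (in words, an assumption) and term frequencies."""
    return {
        "length": len(text.split()),
        "physiological": count_terms(text, PHYSIOLOGICAL_TERMS),
        "psychological": count_terms(text, PSYCHOLOGICAL_TERMS),
    }


profiles = build_profiles(FACTORS)
total_runs = len(profiles) * RUNS_PER_PROFILE
sample = measure_plan(
    "Monitor vital signs and pain. Offer emotional support for anxiety."
)
```

Each profile would be inserted into the same clinical scenario prompt, so that only the sociodemographic description varies between runs.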
Results:
The analysis revealed significant sociodemographic biases in LLM-generated nursing care plans. Female patients received shorter care plans (t = 4.864, P < 0.001) and fewer physiological nursing terms (t = 4.114, P < 0.001). Middle-aged patients received the shortest care plans (F = 4.124, P = 0.006), while younger patients had the lowest frequency of psychological nursing terms (F = 4.834, P = 0.002) compared to older adults. Patients with higher education received shorter care plans (t = -4.202, P < 0.001) and fewer psychological nursing terms (t = -5.724, P < 0.001). Patients from urban areas received shorter care plans (t = -5.388, P < 0.001) and lower frequencies of both physiological (t = -5.180, P < 0.001) and psychological terms (t = -7.689, P < 0.001). High-income patients received the most concise care plans (F = 14.138, P < 0.001) and had the lowest frequency of psychological terms (F = 28.600, P < 0.001). These findings demonstrate systematic sociodemographic disparities in LLM-based care recommendations.
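For readers unfamiliar with the statistics reported above, the following sketch shows how a two-sample t statistic and a one-way ANOVA F statistic are computed from grouped measurements. It assumes the pooled-variance (Student) form of the t-test, which the abstract does not confirm, and the input values are toy numbers rather than study data. The function names are illustrative.

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def anova_f(*groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy values only, e.g. plan lengths for two or three hypothetical groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [3, 4, 5]

t_ab = pooled_t(group_a, group_b)
f_ab = anova_f(group_a, group_b)
f_abc = anova_f(group_a, group_b, group_c)
```

The sign of t depends only on which group is listed first, so a negative value indicates that the first group had the lower mean. With exactly two groups, F equals the square of the pooled t statistic.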
Conclusions:
This study identified significant sociodemographic disparities in LLM-generated nursing care plans, with historically privileged populations (e.g., urban, high-income, and highly educated groups) receiving disproportionately fewer clinical interventions. This research offers the first empirical evidence of fairness-related concerns in the nursing domain. Moreover, it contributes a replicable evaluation framework for detecting bias in LLMs and underscores the need for inclusive model design, transparent validation protocols, and sustained human oversight to ensure equitable care outcomes in AI-assisted clinical settings.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.