Currently submitted to: JMIR Rehabilitation and Assistive Technologies
Date Submitted: May 4, 2026
Open Peer Review Period: May 12, 2026 - Jul 7, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Demographic Bias in Generative Artificial Intelligence Text-to-Image Depiction of Chronic Pain: Cross-Sectional Study
ABSTRACT
Background:
Generative AI platforms are increasingly utilized in healthcare, yet concerns remain about their demographic biases. Chronic pain drives individuals to seek information about diagnosis and treatment online, including AI-generated visual content related to health and disease. It is therefore essential that AI-generated images accurately capture the demographic diversity of patients with chronic pain to ensure equitable and inclusive healthcare representation.
Objective:
The objective of this study was to examine the capabilities and limitations of generative AI in representing the demographics of patients with chronic pain.
Methods:
This cross-sectional study analyzed demographic characteristics in AI-generated images of chronic pain produced by three AI platforms: DALL·E, MidJourney, and Stable Diffusion. Gender, race, socioeconomic status (SES), and age group representations were assessed. Expert consensus labeling was performed to validate AI-based categorizations, with discrepancies between AI and human evaluations quantified. Logistic regression models were used to identify overrepresentation in demographic categories, with log-odds ratios (log-ORs) calculated to measure biases.
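As a point of reference for readers, the log-OR measure used here can be illustrated with a minimal sketch that is not the authors' analysis code: it compares the odds of a category's depiction rate against a reference proportion. The counts and the 50% reference value below are hypothetical placeholders, not study data.

```python
# Minimal sketch (assumed workflow, not the study's code): log-odds ratio of an
# observed depiction rate in generated images vs. an expected reference rate.
import math

def log_odds_ratio(observed_count: int, total: int, reference_prop: float) -> float:
    """Positive values indicate overrepresentation relative to the reference."""
    p_obs = observed_count / total
    odds_obs = p_obs / (1 - p_obs)
    odds_ref = reference_prop / (1 - reference_prop)
    return math.log(odds_obs / odds_ref)

# Hypothetical example: 859 of 1000 images depict males, vs. an expected 50%.
print(round(log_odds_ratio(859, 1000, 0.50), 2))  # ~1.81, i.e., overrepresented
```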
Results:
AI-generated images showed significant biases (p < 0.001) across gender, race, SES, and age. Males, white individuals, middle-income groups, and young adults were consistently overrepresented. DALL·E exhibited the strongest biases, particularly in male (85.9%) and white (81.4%) depictions. MidJourney and Stable Diffusion showed more racial diversity. Low-SES groups were absent from DALL·E and Stable Diffusion outputs. Children and older adults were rarely depicted. Expert verification discrepancies were low for gender and SES (<1%) but higher for race (10%).
Conclusions:
Demographic disparities were identified in AI-generated images of chronic pain. These biases highlight the need for more diverse training datasets, broader model validation, and further integration of AI ethics into health equity efforts.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.