Accepted for/Published in: JMIR AI

Date Submitted: Mar 11, 2024
Open Peer Review Period: Mar 14, 2024 - May 14, 2024
Date Accepted: Oct 1, 2024

The final, peer-reviewed published version of this preprint can be found here:

Ensuring Appropriate Representation in Artificial Intelligence–Generated Medical Imagery: Protocol for a Methodological Approach to Address Skin Tone Bias

O'Malley A, Veenhuizen M, Ahmed A

Ensuring Appropriate Representation in Artificial Intelligence–Generated Medical Imagery: Protocol for a Methodological Approach to Address Skin Tone Bias

JMIR AI 2024;3:e58275

DOI: 10.2196/58275

PMID: 39602221

PMCID: 11635324

Ensuring Appropriate Representation in AI-Generated Medical Imagery: A Methodological Approach to Address Skin Tone Bias

  • Andrew O'Malley
  • Miriam Veenhuizen
  • Ayla Ahmed

ABSTRACT

Background:

In medical education, particularly in anatomy and dermatology, generative artificial intelligence (AI) can be used to create customized illustrations. However, the underrepresentation of darker skin tones in medical textbooks and elsewhere, which serve as training data for AI, poses a significant challenge in ensuring diverse and inclusive educational materials.

Objective:

This study aims to evaluate the extent of skin tone diversity in AI-generated medical images and to test whether the representation of skin tones can be improved by modifying AI prompts to better reflect the demographic makeup of the US population.

Methods:

Two standard AI models (DALL-E 3 and Midjourney) each generated 100 images of people with psoriasis. Additionally, a custom model was developed that incorporated a prompt injection aimed at “forcing” the AI (DALL-E 3) to reflect the skin tone distribution of the US population according to the 2012 American National Election Studies survey. This custom model generated another set of 100 images. Three researchers assessed the skin tones in these images using the New Immigrant Survey skin tone scale, with the median of the three ratings representing each image. A chi-square goodness-of-fit test compared the skin tone distribution from each set of images to that of the US population.
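The analysis described above (a median of three ratings per image, followed by a chi-square goodness-of-fit test) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' actual analysis code. The function names, the 3-point toy scale, the ratings, and the expected proportions are all hypothetical placeholders, not the study's data or the survey's figures.

```python
import math
from statistics import median


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the closed-form recurrence for the regularized upper incomplete
    gamma function Q(a, z) at integer and half-integer a.
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def median_ratings(ratings_per_image):
    """Collapse each image's three rater scores into a single median score."""
    return [median(r) for r in ratings_per_image]


def tally(scores, scale_points):
    """Count how many images fall on each point of the scale."""
    return [sum(1 for s in scores if s == point) for point in scale_points]


def goodness_of_fit(observed_counts, expected_props):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    n = sum(observed_counts)
    stat = sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed_counts, expected_props))
    df = len(observed_counts) - 1
    return stat, df, chi2_sf(stat, df)


# TOY DATA (hypothetical): six images, three raters each, 3-point scale.
toy_ratings = [(1, 1, 2), (1, 2, 1), (1, 1, 1), (2, 2, 3), (2, 3, 2), (3, 3, 2)]
toy_expected = [0.5, 0.3, 0.2]  # placeholder proportions, not survey values

scores = median_ratings(toy_ratings)
counts = tally(scores, scale_points=[1, 2, 3])
stat, df, p = goodness_of_fit(counts, toy_expected)
print(counts, round(stat, 4), df, round(p, 4))
```

With real data, the scale points would be those of the rating scale actually used, and the expected proportions would come from the reference population survey. Expected counts this small would not normally satisfy the usual assumptions of the test; the toy numbers are only there to show the mechanics.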

Results:

The standard AI models (DALL-E 3 and Midjourney) demonstrated a significant difference between the expected skin tones of the US population and the observed tones in the generated images (P=8.62E-11 and P=1.12E-21, respectively). Both standard AI models overrepresented lighter skin. Conversely, the custom model with the modified prompt yielded a distribution of skin tones that closely matched the expected demographic representation, showing no significant difference (P=0.0435).

Conclusions:

This study reveals a notable bias in AI-generated medical images, which predominantly underrepresent darker skin tones. This bias can be effectively addressed by modifying AI prompts to incorporate real-life demographic distributions. The findings emphasize the need for conscious efforts in AI development to ensure diverse and representative outputs, particularly in educational and medical contexts. Users of generative AI tools should be aware that these biases exist and that similar tendencies may also exist in other types of generative AI (e.g., large language models) and for other characteristics (e.g., sex/gender, culture/ethnicity). Injecting demographic data into AI prompts can effectively counteract these biases, ensuring a more accurate representation of the general population.
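One way to picture "injecting demographic data into AI prompts" is to draw a skin tone for each image request from a target distribution and append it to the base prompt. The sketch below is an assumption about how such an injection could work, not a description of the custom model used in the study. The function name, the prompt wording, and the weights are hypothetical, and no image-generation API is called.

```python
import random


def build_prompt(base_prompt, tone_weights, rng):
    """Sample a skin tone by weight and append it to the base prompt.

    tone_weights maps a scale point (e.g. 1 = lightest on the assumed scale)
    to its target share of the population. Returns (tone, full_prompt).
    """
    tones = list(tone_weights)
    weights = [tone_weights[t] for t in tones]
    tone = rng.choices(tones, weights=weights, k=1)[0]
    suffix = (f" Depict a person whose skin tone is {tone} on a "
              f"{len(tones)}-point scale where 1 is the lightest.")
    return tone, base_prompt + suffix


# PLACEHOLDER weights on a toy 3-point scale; not real survey figures.
toy_weights = {1: 0.5, 2: 0.3, 3: 0.2}
rng = random.Random(0)  # seeded so the sketch is reproducible

tone, prompt = build_prompt("An illustration of psoriasis on a forearm.",
                            toy_weights, rng)
print(prompt)
```

Sampling per request, rather than asking the model for "diverse" output in general, means the long-run distribution of requested tones follows the target weights; whether the generated images then match the requested tone would still need to be checked by human raters.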


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.