
Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Jan 12, 2022
Date Accepted: Mar 27, 2022

The final, peer-reviewed published version of this preprint can be found here:

Evaluation and Mitigation of Racial Bias in Clinical Machine Learning Models: Scoping Review

Huang J, Galal G, Etemadi M, Vaidyanathan M

Evaluation and Mitigation of Racial Bias in Clinical Machine Learning Models: Scoping Review

JMIR Med Inform 2022;10(5):e36388

DOI: 10.2196/36388

PMID: 35639450

PMCID: 9198828

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Evaluation and Mitigation of Racial Bias in Clinical Machine Learning Models: A Scoping Review

  • Jonathan Huang; 
  • Galal Galal; 
  • Mozziyar Etemadi; 
  • Mahesh Vaidyanathan

ABSTRACT

Background:

Racial bias is a key concern regarding the development, validation, and implementation of machine learning (ML) models in clinical settings. Despite the potential of bias to propagate health disparities, racial bias in clinical ML has yet to be thoroughly examined, and best practices for bias mitigation remain unclear.

Objective:

Our objective was to perform a scoping review to characterize the methods by which racial bias in clinical ML models has been assessed and to describe strategies that may be used to enhance algorithmic fairness in clinical ML.

Methods:

A scoping review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses Extension for Scoping Reviews. A literature search of the PubMed, Scopus, and Embase databases, as well as Google Scholar, identified 635 records, of which 12 studies were included.

Results:

Applications of ML were varied and involved diagnosis, outcome prediction, and clinical score prediction performed on datasets including images, diagnostic studies, clinical text, and clinical variables. One study (8%) described a model in routine clinical use, two (17%) examined prospectively validated clinical models, and the remaining nine (75%) described internally validated models. Eight studies (67%) concluded that racial bias was present, two (17%) concluded that it was not, and two (17%) assessed the implementation of bias mitigation strategies without comparison to a baseline model. Fairness metrics used to assess algorithmic racial bias were inconsistent. The most commonly observed metrics were equal opportunity difference (5/12, 42%), accuracy (4/12, 33%), and disparate impact (2/12, 17%). All eight studies (67%) that implemented methods for mitigation of racial bias successfully increased fairness as measured by the authors’ chosen metrics. Pre-processing methods of bias mitigation were the most commonly used across all studies that implemented them.
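The two group fairness metrics most often reported above, equal opportunity difference and disparate impact, can be illustrated with a short sketch. This is a minimal illustration, not code from any of the reviewed studies; the labels, predictions, and binary group indicator below are hypothetical.

```python
# Sketch of two group fairness metrics for a binary classifier,
# computed over a binary protected-group indicator (0 or 1).
import numpy as np

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates between group 1 and group 0.
    A value of 0 indicates equal opportunity; larger magnitudes indicate bias."""
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return tprs[1] - tprs[0]

def disparate_impact(y_pred, group):
    """Ratio of positive prediction rates (group 1 / group 0).
    The common 'four-fifths rule' flags ratios below 0.8 (or above 1.25)."""
    rates = [y_pred[group == g].mean() for g in (0, 1)]
    return rates[1] / rates[0]

# Hypothetical data: true labels, model predictions, group membership.
y_true = np.array([1, 1, 0, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(equal_opportunity_difference(y_true, y_pred, group))  # TPR gap
print(disparate_impact(y_pred, group))  # positive-rate ratio
```

Note that the two metrics can disagree: in this toy data the positive prediction rates are identical across groups (disparate impact of 1.0), yet the true positive rates differ, which is one reason the review finds metric choice to be consequential and inconsistent across studies.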

Conclusions:

The broad scope of medical ML applications and potential patient harms demand an increased emphasis on evaluation and mitigation of racial bias in clinical ML. However, the adoption of algorithmic fairness principles in medicine remains inconsistent and is limited by poor data availability and ML model reporting. We recommend that researchers and journal editors emphasize standardized reporting and data availability in medical ML studies to improve transparency and facilitate evaluation for racial bias.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.