Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Jan 25, 2025
Open Peer Review Period: Jan 25, 2025 - Mar 22, 2025
Date Accepted: Apr 3, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Detecting, Characterizing and Mitigating Implicit and Explicit Racial Biases in Healthcare Datasets with Subgroup Learnability
ABSTRACT
Background:
The growing adoption of diagnostic and prognostic algorithms in healthcare has led to concerns about the perpetuation of algorithmic bias against disadvantaged groups of individuals. Deep learning methods to detect and mitigate bias have revolved around modifying models, optimization strategies, and threshold calibration with varying levels of success and tradeoffs. However, there have been limited substantive efforts to address bias at the level of the data used to generate algorithms in healthcare datasets.
Objective:
We introduce a simple metric, AEquity, that uses a learning-curve approximation to detect and mitigate bias via guided dataset collection or relabeling.
Methods:
We demonstrate this metric on two well-known examples, chest radiographs and healthcare cost utilization, and use it to detect novel biases in the National Health and Nutrition Examination Survey.
Results:
We demonstrate that using AEquity to guide data-centric collection for each diagnostic finding in the chest radiograph dataset decreased bias by between 29% and 96.5%, as measured by differences in area under the curve. When we examined Black patients on Medicaid, at the intersection of race and socioeconomic status, we found that AEquity-based interventions reduced bias across a number of fairness metrics: overall false negative rate by 33.3% (absolute bias reduction = 1.88 × 10⁻¹; 95% CI 1.4 × 10⁻¹ to 2.5 × 10⁻¹; bias reduction 33.3%, 95% CI 26.6%-40.0%); precision bias by 94.6% (absolute bias reduction = 7.50 × 10⁻²; 95% CI 7.48 × 10⁻² to 7.51 × 10⁻²; bias reduction 94.6%, 95% CI 94.5%-94.7%); and false discovery rate by 94.5% (absolute bias reduction = 3.50 × 10⁻²; 95% CI 3.49 × 10⁻² to 3.50 × 10⁻²). Similarly, AEquity-guided data collection reduced bias by up to 80% on mortality prediction with the National Health and Nutrition Examination Survey (absolute bias reduction = 0.08; 95% CI 0.07-0.09). Additionally, we benchmark against balanced empirical risk minimization and calibration and show that AEquity-guided data collection outperforms both standard approaches. Moreover, we demonstrate that AEquity works across model classes: fully connected networks, convolutional neural networks such as ResNet-50, transformer architectures such as ViT-B/16 (an 86 million parameter Vision Transformer), and nonparametric methods such as LightGBM.
Conclusions:
In short, we demonstrate that AEquity is a robust tool by applying it to different datasets and algorithms, performing intersectional analyses, and measuring its effectiveness with respect to a range of traditional fairness metrics.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.