Currently submitted to: JMIR AI
Date Submitted: May 22, 2026
Open Peer Review Period: May 25, 2026 - Jul 20, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Detecting and Mitigating AI Bias in Healthcare: Development and Validation of a Unified Multi-Stage Framework
ABSTRACT
Background:
AI-driven clinical systems can improve diagnosis, prognosis, and resource allocation, but they may reproduce disparities encoded in historical healthcare data. Existing mitigation methods typically target a single source of bias, while clinical datasets often contain interacting representation, proxy, integrity, and temporal biases.
Objective:
This study develops and evaluates a unified multi-stage framework for detecting and mitigating multiple forms of bias in structured healthcare machine learning data.
Methods:
We designed a compositional pipeline, D_clean = T_temp(T_int(T_proxy(T_repr(D)))), in which each stage conditions on the corrected output of the previous stage. To address cross-dataset heterogeneity, all datasets were first mapped into a prespecified harmonized clinical-concept space with explicit missing-concept masks. The five anchor features used for alignment were race, sex, age_group, income_proxy, and n_prior_visits. The final harmonized representation contained 37 clinical concepts plus 37 corresponding binary mask indicators, yielding a 74-dimensional model input after categorical expansion and mask concatenation. The primary model was trained on the Diabetes 130-US Hospitals dataset. External validation used CMS SynPUF for readmission prediction and NHANES for stage-level fairness and distributional stress testing rather than unsupported direct outcome transfer. Integrity bias was assessed with distributional tests appropriate to each variable type; Benford-style leading-digit analysis was restricted to unbounded count or charge-like variables and was not applied to bounded physiological laboratory values such as HbA1c.
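The composition and the masked concept representation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stage functions `t_repr`, `t_proxy`, `t_int`, and `t_temp` are hypothetical placeholders that only record the order in which they run, the concept names `concept_0` to `concept_36` are invented, and each concept is assumed to reduce to a single numeric value (categorical expansion is not modeled).

```python
from functools import reduce

import numpy as np


def compose(*stages):
    """Return a function that applies the stages left to right,
    so each stage receives the output of the previous one."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


# Hypothetical placeholder stages: each one only appends its name,
# which makes the order of application visible.
def t_repr(d):
    return d + ["repr"]


def t_proxy(d):
    return d + ["proxy"]


def t_int(d):
    return d + ["int"]


def t_temp(d):
    return d + ["temp"]


# D_clean = T_temp(T_int(T_proxy(T_repr(D))))
pipeline = compose(t_repr, t_proxy, t_int, t_temp)


def harmonize(record, concepts):
    """Map a source record (dict) onto a fixed concept list.

    Returns the concept values followed by a binary mask in which
    1.0 marks a concept present in the source record and 0.0 marks
    a missing concept (whose value slot is filled with 0.0).
    """
    values = np.array([record.get(c, 0.0) for c in concepts], dtype=float)
    mask = np.array([1.0 if c in record else 0.0 for c in concepts])
    return np.concatenate([values, mask])


# Assumed concept names; the abstract gives only the count (37).
CONCEPTS = [f"concept_{i}" for i in range(37)]
x = harmonize({"concept_0": 1.5, "concept_3": 2.0}, CONCEPTS)
print(pipeline([]), x.shape)
```

With 37 concepts and 37 mask indicators, the resulting vector has 74 entries, matching the input dimension stated in the Methods.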
Results:
On the primary Diabetes 130-US Hospitals test split, the proposed pipeline improved AUC from 0.798 to 0.812 and reduced Demographic Parity Difference (DPD) from 0.134 to 0.052. The DPD reduction was statistically significant (bootstrap 95% CI -0.094 to -0.069; paired permutation P < .001). On CMS SynPUF after harmonized concept mapping, DPD decreased from 0.141 to 0.066. NHANES stage-level validation showed improved representation balance and proxy attenuation, while HbA1c integrity checks were evaluated with bounded-variable distributional baselines rather than Benford's Law. Mixture-of-experts processing isolated the 7,938 of 101,766 records (7.8%) flagged for integrity concerns and improved fairness without discarding records.
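For readers unfamiliar with the fairness metric, the sketch below shows one common way to compute DPD and a percentile bootstrap interval for its change. It rests on assumptions the abstract does not confirm: DPD is taken as the largest gap in positive-prediction rate between any two groups, the interval is a plain percentile bootstrap over records, and the data are synthetic. The function names `dpd` and `bootstrap_dpd_change` are hypothetical.

```python
import numpy as np


def dpd(y_pred, group):
    """Demographic Parity Difference: the largest gap in
    positive-prediction rate between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)


def bootstrap_dpd_change(pred_before, pred_after, group, n_boot=2000, seed=0):
    """Percentile bootstrap 95% interval for DPD(after) - DPD(before).

    Records are resampled with replacement, and the same resampled
    indices are used for both prediction vectors so the comparison
    stays paired.
    """
    rng = np.random.default_rng(seed)
    n = len(group)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        diffs[b] = dpd(pred_after[idx], group[idx]) - dpd(pred_before[idx], group[idx])
    return np.percentile(diffs, [2.5, 97.5])


# Small hand-checkable case: group "a" rate 0.75, group "b" rate 0.25.
toy_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
toy_group = np.array(["a"] * 4 + ["b"] * 4)
toy_dpd = dpd(toy_pred, toy_group)

# Synthetic data (not from the study): the "before" predictions favor
# group "a", the "after" predictions use one rate for both groups.
rng = np.random.default_rng(42)
group = rng.choice(["a", "b"], size=1000)
pred_before = (rng.random(1000) < np.where(group == "a", 0.40, 0.25)).astype(int)
pred_after = (rng.random(1000) < 0.30).astype(int)
lo, hi = bootstrap_dpd_change(pred_before, pred_after, group)
print(f"toy DPD = {toy_dpd:.2f}; 95% CI for change = ({lo:.3f}, {hi:.3f})")
```

An interval that lies entirely below zero, as the reported -0.094 to -0.069 does, indicates a reduction in DPD under this kind of resampling.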
Conclusions:
A coordinated data-centric pipeline can improve both fairness and predictive performance when dataset heterogeneity, variable-specific integrity assumptions, and subgroup-specific processing are made explicit. The revised framework addresses the methodological risk of unsupported zero-shot transfer by introducing a harmonized concept layer and by reporting outcome validation only where the prediction task and feature space are aligned.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.