Previously submitted to: JMIR AI (no longer under consideration since Jul 28, 2024)
Date Submitted: Aug 18, 2023
Open Peer Review Period: Aug 17, 2023 - Oct 12, 2023
Warning: This is an unreviewed preprint. Readers are warned that the document has not been peer-reviewed by expert/patient reviewers or an academic editor, may contain misleading claims, and is likely to undergo changes before final publication, if accepted, or may have been rejected or withdrawn (a note "no longer under consideration" will appear above).
Citation: Please cite this preprint only for review purposes or for grant applications and CVs (if you are the author).
Final version: If our system detects a final peer-reviewed "version of record" (VoR) published in any journal, a link to that VoR will appear below. Readers are then encouraged to cite the VoR instead of this preprint.
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Auditing Natural Language Processing for Gender Equality in Sub-Saharan African Healthcare Systems: Framework Development and Evaluation
ABSTRACT
Background:
Natural Language Processing (NLP) models are in wide and growing use in clinical and healthcare domains. Such applications enable scalable, efficient delivery of health information, but their effectiveness is prone to equity challenges across demographics and contexts. These models are only as good as the data they are trained on, the training procedure, and the model parameters. Moreover, they are highly sensitive to latent demographic signals such as gender, age, nationality, and native language. Applications built from biased components produce inequitable outcomes, and these equity and accessibility challenges are more prevalent in rural regions of the world.
Objective:
This paper describes and evaluates a novel active learning approach for incrementally improving the accuracy of an NLP model while optimizing for gender-equitable outcomes in healthcare systems. The approach employs an iterative cyclic model, incorporating data annotation using NLP, human auditing to improve annotation accuracy, especially for data with demographic segmentation, testing on new data (with intentional bias favoring underperforming demographics), and a loopback system for retraining the model and applying it to new data.
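To make the cycle concrete, the following minimal Python sketch outlines one audit-and-retrain increment. It is an illustration under stated assumptions, not the implementation used in the study: the scikit-learn pipeline, the audit_fn callback, and the two-fold oversampling factor are all placeholders.

# Hypothetical sketch of one active-learning increment: model-assisted
# annotation, human audit, oversampling of the under-performing group,
# and retraining. Helper names and the oversampling factor are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def audit_and_retrain(texts, groups, audit_fn, model=None, boost_group=None):
    # audit_fn(text, predicted_label) -> corrected label (human-in-the-loop).
    if model is None:
        # First increment: no model yet, so auditors label from scratch.
        model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        predictions = [None] * len(texts)
    else:
        # Later increments: the current model proposes annotations.
        predictions = list(model.predict(texts))

    # Human auditors correct (or supply) the labels.
    labels = [audit_fn(t, p) for t, p in zip(texts, predictions)]

    # Intentionally bias the training data toward the under-performing group
    # by duplicating its examples (a simple stand-in for the paper's strategy).
    weighted = [(t, y) for t, y, g in zip(texts, labels, groups)
                for _ in range(2 if g == boost_group else 1)]
    X, y = zip(*weighted)

    model.fit(list(X), list(y))   # loop back: retrain before the next batch
    return model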
Methods:
We describe the experimental integration of an audit tool and workflow with distinct NLP tasks in two separate contexts: (1) annotation of medical symptoms collected in Hausa and English, based on responses to a research questionnaire about health access in Northern Nigeria; and (2) message intent classification in English and Swahili, based on spontaneous user messages to a health guide chatbot in Nigeria and Kenya.
Results:
Baseline results showed an equity gap in both precision (P) and recall (R): P=.725 and R=.676 for the over-represented class versus P=.669 and R=.651 for the under-represented class. Application of the active learning tool and workflow mitigated this gap after three increments of auditing and retraining (P=.721 and R=.760 for the under-represented class).
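For reference, per-group precision and recall of the kind reported above can be computed as in the sketch below; the grouping scheme, variable names, and weighted averaging are illustrative assumptions and do not come from the study data.

# Illustrative computation of the per-group precision/recall used to
# quantify the equity gap; group labels here are placeholders.
from sklearn.metrics import precision_score, recall_score

def per_group_scores(y_true, y_pred, groups):
    scores = {}
    for g in sorted(set(groups)):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        scores[g] = (
            precision_score(yt, yp, average="weighted", zero_division=0),
            recall_score(yt, yp, average="weighted", zero_division=0),
        )
    return scores

# The equity gap is then the difference between the two groups' scores,
# e.g. per_group_scores(y_true, y_pred, groups)["under_represented"].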
Conclusions:
Our findings indicate that this gender-aware audit workflow is language agnostic and capable of mitigating demographic inequity while improving overall system accuracy.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.