Accepted for/Published in: JMIR AI
Date Submitted: Oct 30, 2024
Date Accepted: Nov 17, 2025
Date Submitted to PubMed: Dec 8, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
XAI-Driven Comparative Analysis of Machine Learning Models for Predicting HIV Viral Suppression in Ugandan Patients
ABSTRACT
Background:
HIV viral suppression is essential for improving health outcomes and reducing transmission rates amongst people living with HIV (PLWH). In Uganda, where HIV/AIDS is a major public health concern, machine learning (ML) models can predict viral suppression effectively. However, limited use of explainable AI (XAI) methods affects model transparency and clinical utility.
Objective:
This study aimed to develop and compare ML models for predicting viral non-suppression in Ugandan PLWH on antiretroviral therapy (ART). XAI techniques were then applied to the best-performing model to identify key predictors of viral non-suppression, enhancing model transparency and enabling personalised predictions.
Methods:
We retrospectively analysed clinical and demographic data from 1101 Ugandan PLWH on ART at the HIV clinic in Muyembe HCIV between June 2016 and April 2018, focusing on predicting viral non-suppression (viral load >1000 copies/mL). The dataset was divided into model-building (training: 80%) and validation (test: 20%) sets. To address class imbalance, the synthetic minority over-sampling technique (SMOTE) was applied. For global explanation, eight machine learning algorithms—logistic regression, stacked ensemble, random forest, support vector machines, extreme gradient boosting, k-nearest neighbours, naïve Bayes and artificial neural networks—were compared. Model performance was evaluated using metrics such as accuracy, precision, recall, F1 score, Cohen's kappa and AUC. For local explanation, individual conditional expectation (ICE) plots, SHapley Additive exPlanations (SHAP), break-down and SHAP force plots were used to provide insights into predictions for individual patients.
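As an illustration of the oversampling step named above, the following is a minimal sketch of SMOTE-style interpolation using only the Python standard library. It is not the authors' implementation: the function name `smote`, the toy feature vectors, and the parameters `n_new`, `k` and `seed` are assumptions made for this example. The abstract also does not state whether oversampling was restricted to the training set, so the sketch operates on an arbitrary list of minority-class samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic


# Hypothetical two-feature minority samples (illustrative values only)
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6, k=3)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the range already spanned by the minority class rather than duplicating records exactly.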
Results:
The XGBoost model achieved the best performance, with an accuracy of 0.89, precision of 0.63, recall of 0.61, Cohen's kappa of 0.56 and AUC of 0.78. It had a specificity of 0.94 and an F1 score of 0.62, reflecting balanced performance in predicting viral suppression. SHAP analysis identified adherence over the last three months as the most critical predictor of viral non-suppression. Poor adherence was associated with higher rates of non-suppression. Other key predictors included WHO clinical stage, ART supporter relationships (caregiver and relationships), and weight at ART initiation. Marital status, ART duration, and point of entry into the ART clinic (maternity) also influenced predictions. Local explanations revealed poor adherence as a driver for true positive and false positive cases.
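The reported metrics all derive from the counts of a binary confusion matrix, with viral non-suppression as the positive class. The sketch below shows the standard formulas and checks that the reported F1 score is consistent with the reported precision and recall. The confusion-matrix counts are hypothetical values chosen for illustration, not the study's data, and the function names are assumptions of this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not taken from the study)
example = classification_metrics(tp=20, fp=10, fn=10, tn=160)

# Consistency check on the reported values: precision 0.63, recall 0.61
reported_f1 = round(f1_from(0.63, 0.61), 2)  # 0.62, matching the abstract
```

With an imbalanced outcome, accuracy and specificity can be high while precision and recall remain moderate, which is why kappa and F1 are reported alongside them.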
Conclusions:
The XGBoost model showed the highest performance in predicting viral suppression amongst Ugandan PLWH on ART, with adherence as the most important predictor of non-suppression. XAI methods provided transparency into the model's decision-making process, enhancing clinical trust and guiding personalised interventions to improve HIV care outcomes.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.