Accepted for/Published in: JMIR Pediatrics and Parenting
Date Submitted: Apr 29, 2025
Date Accepted: Oct 14, 2025
Modeling Zero-Dose Children in Ethiopia: A Machine Learning Perspective on Model Performance and Predictor Variables
ABSTRACT
Background:
Despite progress in childhood vaccination, many children in low- and middle-income countries (LMICs), including Ethiopia, remain unvaccinated, presenting a significant public health challenge. The Immunization Agenda 2030 (IA2030) seeks to halve the number of unvaccinated children by identifying at-risk populations, but effective strategies are limited. This study uses machine learning (ML) to identify Ethiopian children aged 12 to 35 months who are at higher risk of being zero-dose. By analyzing demographic, socioeconomic, and health care access data, the study developed predictive models using different algorithms. The findings are intended to inform targeted interventions and, ultimately, to improve vaccination coverage and health outcomes.
Objective:
This study aimed to develop a machine learning model to predict which children are zero-dose and to identify the most influential predictors of zero-dose status in Ethiopia.
Methods:
We examined how well predictive algorithms can characterize a child at risk of being zero-dose, using predictor variables drawn from the recent national immunization survey data. We applied supervised machine learning algorithms to the survey data sets, which included 13,666 children aged 12 to 35 months. Model performance was assessed using accuracy, area under the curve, precision, recall, and F1 score. We applied SHapley Additive exPlanations (SHAP) analysis to identify the most important predictors.
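The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the labels, scores, 0.5 threshold, and function names below are hypothetical and chosen only to show how each metric is defined for a binary zero-dose (1) versus vaccinated (0) outcome. The area under the curve is computed with the rank-based (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = zero-dose)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the survey.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas the area under the curve summarizes ranking quality across all thresholds.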
Results:
The Light Gradient Boosting Machine (LightGBM), Random Forest, Extreme Gradient Boosting (XGBoost), and AdaBoost classifiers effectively identified most zero-dose (ZD) children as being at high risk. Among these, LightGBM demonstrated the best performance, achieving an accuracy of 93%, an area under the curve (AUC) of 97%, a precision of 94%, and a recall of 91%. The features with the greatest impact on the model included poor perception of vaccination benefits, lack of antenatal care (ANC) utilization, distance from immunization services, and absence of maternal tetanus toxoid vaccination.
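The abstract reports precision and recall for LightGBM but not the F1 score. As a rough consistency check, the F1 implied by those two figures can be computed as their harmonic mean. The function name is hypothetical, and because the inputs are rounded percentages, the result is only approximate and may differ slightly from the value in the full paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision (94%) and recall (91%) as reported for LightGBM, expressed as fractions.
implied_f1 = f1_from_precision_recall(0.94, 0.91)
print(round(implied_f1, 3))  # 0.925
```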
Conclusions:
The developed machine learning models effectively predict children at risk of being zero-dose, with the LightGBM model showing the best performance. This model can guide targeted interventions to reduce zero-dose prevalence and address vaccination inequities. Key predictors include access to immunization sites, maternal health service utilization, and perceptions of immunization benefits. By focusing on these vulnerable groups, public health efforts can tackle disparities in vaccination coverage. Enhancing maternal care, raising caregiver awareness, and improving immunization access through outreach can significantly reduce the number of zero-dose children.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.