Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Jan 23, 2025
Date Accepted: Jun 6, 2025
Machine Learning-Based Prediction of Determinants for Cervical Cancer Screening Among Women Aged 30-49 in Sub-Saharan Africa
ABSTRACT
Background:
Cervical cancer is the fourth most prevalent cancer in women, with 660,000 new cases and 350,000 deaths in 2022. Effectively implemented early screening could reduce the overall number of cervical cancer cases by up to 80%, prevent more than 40% of new cases, and save 5 million lives. Machine learning makes it practical to analyze large datasets and use them for decision-making.
Objective:
This study aimed to assess a machine learning-based prediction model and to identify the key determinants influencing cervical cancer screening uptake among women aged 30-49 in Sub-Saharan Africa.
Methods:
A weighted dataset of 33,952 from the 2022 Demographic and Health Surveys (DHS) in Ghana, Kenya, Mozambique, and Tanzania was used. STATA version 17 and Python 3.10 were used for data preprocessing and analysis. MinMax and standard scaling were applied for feature scaling, and Recursive Feature Elimination (RFE) was used for feature selection. The data were split in an 80:20 ratio. Tomek Links with Random Over-Sampling were used to handle class imbalance. Seven models were selected and trained on both balanced and unbalanced datasets. Models were evaluated using ROC-AUC, accuracy, and the confusion matrix.
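The preprocessing steps named above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the abstract does not name the Python libraries used (scikit-learn and imbalanced-learn are common choices for these steps), all function names here are hypothetical, the toy data are made up, and RFE and model training are omitted. Treating screened women as the minority class is also an assumption.

```python
import math
import random

def min_max_scale(column):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def standard_scale(column):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std if std else 0.0 for v in column]

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)

def nearest_neighbor(i, rows):
    """Index of the row closest to rows[i] (Euclidean distance), excluding i."""
    return min((j for j in range(len(rows)) if j != i),
               key=lambda j: math.dist(rows[i], rows[j]))

def remove_tomek_links(rows, labels, majority):
    """Drop majority-class rows that form a Tomek link, i.e. a pair of
    mutual nearest neighbors that belong to opposite classes."""
    nn = [nearest_neighbor(i, rows) for i in range(len(rows))]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and labels[i] != labels[j]:
            drop.add(i if labels[i] == majority else j)
    keep = [i for i in range(len(rows)) if i not in drop]
    return [rows[i] for i in keep], [labels[i] for i in keep]

def random_oversample(rows, labels, minority, seed=0):
    """Duplicate random minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    deficit = (len(labels) - len(minority_idx)) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(max(deficit, 0))]
    return rows + [rows[i] for i in extra], labels + [labels[i] for i in extra]

# Toy demonstration on made-up one-feature data (not survey data).
rows = [[0.0], [1.0], [2.5], [4.5], [5.0], [9.0]]
labels = [0, 0, 0, 0, 1, 1]  # 1 = screened (assumed minority), 0 = not screened
cleaned_rows, cleaned_labels = remove_tomek_links(rows, labels, majority=0)
balanced_rows, balanced_labels = random_oversample(
    cleaned_rows, cleaned_labels, minority=1)
```

In the toy data, the rows at 4.5 and 5.0 are mutual nearest neighbors with different labels, so the majority-class row is removed before over-sampling. Resampling is usually applied to the training portion only so that the test set keeps its original class distribution; the abstract does not say whether the authors did this.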
Results:
The Random Forest classifier ranked best among the seven algorithms for predicting cervical cancer screening, with an accuracy of 78%, a ROC-AUC of 86%, and a confusion matrix score of 72.7% on the test data. It showed that wealth status, awareness of STIs, HIV testing, age at first sex, primary education and above, and living in urban areas are significant factors associated with increased cervical cancer screening. In contrast, not owning a smartphone, having a single sexual partner, and unknown health status are associated with decreased cervical cancer screening.
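For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, the confusion matrix, and ROC-AUC can be computed for a binary classifier. The function names and toy values are hypothetical and unrelated to the study's data, and the abstract does not define how its "confusion matrix score" was derived, so only the matrix itself is shown.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp

def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / len(y_true)

def roc_auc(y_true, scores):
    """Probability that a random positive case receives a higher score
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values: true screening status and predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

print(confusion_matrix(y_true, y_pred))  # (2, 0, 1, 1)
print(accuracy(y_true, y_pred))          # 0.75
print(roc_auc(y_true, scores))           # 0.75
```

Accuracy depends on the chosen classification threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the two figures can differ for the same model.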
Conclusions:
To promote cervical cancer screening in Africa, it is recommended to focus on education and awareness campaigns, establish outreach programs, begin screening at the health post or community level, and address the digital divide.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.