JMIR Preprints #71677: Machine Learning-Based Prediction of Determinants for Cervical Cancer Screening Among Women Aged 30-49 in Sub-Saharan Africa

Machine Learning-Based Prediction of Determinants for Cervical Cancer Screening Among Women Aged 30-49 in Sub-Saharan Africa

Nebebe Demis Baykemagn;
Mekuriaw Nibret Aweke;
Amare Mesfin;
Lemlem Daniel Baffa;
Muluken Chanie Agimas;
Habtamu Wagnew Abuhay;
Dagnew Getnet Adugna;
Tewodros Getaneh Alemu;
Alemu Teshale Bicha;
Gebrie Getu Alemu

ABSTRACT

Background:

Cervical cancer is the fourth most common cancer in women, with 660,000 new cases and 350,000 deaths in 2022. Effectively implemented early screening could reduce the overall number of cervical cancer cases by up to 80%, prevent more than 40% of new cases, and save 5 million lives. Machine learning makes it practical to analyze large survey datasets and use the results for decision-making.

Objective:

This study aimed to assess a machine learning-based prediction model and identify the key determinants influencing cervical cancer screening uptake among women aged 30-49 in Sub-Saharan Africa.

Methods:

This study used a weighted sample of 33,952 from the 2022 Demographic and Health Surveys (DHS) of Ghana, Kenya, Mozambique, and Tanzania. STATA version 17 and Python 3.10 were used for data preprocessing and analysis. Min-max and standard scaling were applied to the features, and Recursive Feature Elimination (RFE) was used for feature selection. The data were split into training and test sets at an 80:20 ratio. Class imbalance was handled using Tomek Links combined with Random Over-Sampling. Seven models were trained on both the balanced and unbalanced datasets and evaluated using ROC-AUC, accuracy, and the confusion matrix.
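The class-imbalance step named above (Tomek Links followed by Random Over-Sampling) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the toy two-feature data, and the seed are all hypothetical, and a real analysis would more likely use a dedicated resampling library and apply the step to the training split only.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor(index, X):
    """Index of the row closest to X[index], excluding the row itself."""
    return min(
        (j for j in range(len(X)) if j != index),
        key=lambda j: squared_distance(X[index], X[j]),
    )


def remove_tomek_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link.

    A Tomek link is a pair of rows from different classes that are
    each other's nearest neighbor.
    """
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and y[i] != y[j]:
            drop.add(i if y[i] == majority_label else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, minority_label, seed=0):
    """Duplicate random minority rows until both classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(y)
    shortfall = max(counts.values()) - counts[minority_label]
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    return X + [X[i] for i in extra], y + [y[i] for i in extra]


# Hypothetical toy data: label 1 = screened (minority), 0 = not screened.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5.1, 5], [8, 8]]
y_train = [0, 0, 0, 0, 0, 1, 1]

X_clean, y_clean = remove_tomek_links(X_train, y_train, majority_label=0)
X_bal, y_bal = random_oversample(X_clean, y_clean, minority_label=1)
print(Counter(y_train), Counter(y_clean), Counter(y_bal))
```

In this toy set, the majority row `[5, 5]` and the minority row `[5.1, 5]` are mutual nearest neighbors, so the majority row is removed before the minority class is duplicated up to parity.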

Results:

The Random Forest classifier ranked best among the seven algorithms for predicting cervical cancer screening, with an accuracy of 78%, an AUC of 86%, and a confusion matrix score of 72.7% on the test data. The model showed that wealth status, awareness of STIs, HIV testing, age at first sex, primary education and above, and living in urban areas were associated with increased cervical cancer screening, whereas not owning a smartphone, having a single sexual partner, and unknown health status were associated with decreased screening.
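As a reading aid for the metrics reported above, the following standard-library sketch shows how accuracy, the four confusion matrix cells, and the AUC are conventionally computed for a binary classifier. The labels, scores, and 0.5 threshold are hypothetical toy values, not study data, and the abstract does not define how its "confusion matrix score" was derived, so the sketch only tabulates the cells.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded 0/1."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive is scored
    above a randomly chosen negative, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy values: 1 = screened, 0 = not screened.
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.3, 0.6, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

cells = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(cells, acc, auc)
```

Accuracy depends on the chosen threshold, while the AUC is computed from the scores alone, which is why the two figures can differ for the same model.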

Conclusions:

To promote cervical cancer screening in Africa, it is recommended to focus on education and awareness campaigns, establish outreach programs, begin screening at the health post or community level, and address the digital divide.

Citation

Please cite as:

Baykemagn ND, Aweke MN, Mesfin A, Baffa LD, Agimas MC, Abuhay HW, Adugna DG, Alemu TG, Bicha AT, Alemu GG

Identifying Predictors of Cervical Cancer Screening Uptake in Sub-Saharan Africa Using Machine Learning: Cross-Sectional Study

JMIR Public Health Surveill 2025;11:e71677

DOI: 10.2196/71677

PMID: 40961361

PMCID: 12443358

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Jan 23, 2025

Date Accepted: Jun 6, 2025
