Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Sep 27, 2021
Date Accepted: Feb 26, 2022
Machine learning modeling for preterm birth prediction using health record: A systematic review
ABSTRACT
Background:
Preterm birth (PTB), a common pregnancy complication, is responsible for 35% of the 3.1 million pregnancy-related deaths each year and significantly affects around 15 million children annually worldwide. Conventional approaches to predicting PTB lack reliable predictive power, leaving more than 50% of cases undetected. Recently, machine learning (ML) models have shown potential as an appropriate complementary approach for PTB prediction using health record (HR) data.
Objective:
In this article, we systematically reviewed the literature on PTB prediction using health record (HR) data and ML modeling.
Methods:
This systematic review was conducted in accordance with the PRISMA statement. A comprehensive search was performed in seven bibliographic databases up to May 15, 2021. The quality of the studies was assessed, and descriptive information, including characteristics of the data, ML modeling processes, and model performance, was extracted and reported.
Results:
A total of 732 papers were screened by title and abstract. Of these, 23 studies were screened by full text, resulting in 13 papers that met the inclusion criteria. Sample sizes ranged from 274 to 1,400,000. The period over which data were extracted ranged from 1 to 11 years, and the oldest and newest data dated from 1988 and 2018, respectively. Population, dataset, and ML model characteristics were assessed, and model performance was often reported using metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
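For readers less familiar with the metrics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are typically computed for a binary PTB classifier. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; they are not taken from any of the reviewed studies.

```python
# Illustrative only: toy labels (1 = preterm, 0 = term) and predicted risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05]


def confusion_counts(labels, predictions):
    """Return (tp, tn, fp, fn) for binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def threshold_metrics(labels, risk_scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed decision threshold.

    Assumes both classes are present; otherwise a denominator would be zero.
    """
    predictions = [1 if s >= threshold else 0 for s in risk_scores]
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of preterm cases detected
        "specificity": tn / (tn + fp),  # share of term cases correctly ruled out
    }


def roc_auc(labels, risk_scores):
    """AUC as the probability that a random positive case outranks a random
    negative case (ties count as half); equivalent to the area under the ROC curve."""
    pos = [s for t, s in zip(labels, risk_scores) if t == 1]
    neg = [s for t, s in zip(labels, risk_scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


metrics = threshold_metrics(y_true, scores)
auc_value = roc_auc(y_true, scores)
print(metrics, auc_value)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. Because PTB is the minority class, accuracy alone can look high for a model that misses most preterm cases, which is one reason the choice of metric needs justification.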
Conclusions:
Various ML models applied to different HR data showed potential for PTB prediction. However, the evaluation metrics, software/packages used, data size and type, selected features, and, importantly, the data management method were often left unjustified, threatening the reliability, performance, and internal/external validity of the models. To understand the usefulness of ML in closing the existing gap, future studies are also encouraged to compare ML models with a conventional method on the same dataset.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.