Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Nov 25, 2024
Date Accepted: Mar 25, 2025
Development of risk prediction models to identify the risk of infant rapid weight gain using machine learning algorithms and data from seven cohorts
ABSTRACT
Background:
Rapid weight gain (RWG) during infancy, defined as upward crossing of one centile line on a weight growth chart, is highly predictive of subsequent obesity risk. Identification of infant RWG could facilitate obesity risk assessment from infancy.
Objective:
Leveraging machine learning algorithms, this study aimed to develop and validate risk prediction models to identify infant RWG by age one year.
Methods:
Data from seven Australian and New Zealand cohorts were pooled for risk model development and validation (n=5233). Eight machine learning algorithms predicted infant RWG using routinely available prenatal and early postnatal factors including maternal pre-pregnancy weight status, maternal smoking during pregnancy, gestational age, parity, infant sex, birth weight, any breastfeeding and timing of solids introduction at age 6 months. Pooled data were randomly split into a training dataset (70%) and a test dataset (30%) for model training and validation, respectively. Model consistency was evaluated using five-fold cross-validation. Model predictive performance was evaluated by area under the curve (AUC), accuracy, precision, sensitivity, and specificity.
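As an illustration of the evaluation design described above, the following is a minimal sketch, not the authors' code. It assumes scikit-learn, uses synthetic data in place of the pooled cohort data, and shows only one of the eight algorithms. The stratified split, the random seeds, the hyperparameters, and the choice to run five-fold cross-validation within the training portion are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the pooled data (assumption): 8 predictors,
# roughly 27% positive class to mirror the reported RWG prevalence.
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=5,
    n_redundant=0,
    weights=[0.73, 0.27],
    random_state=0,
)

# 70% training / 30% test split. Stratifying on the outcome is an
# assumption; the abstract only states that the split was random.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# One illustrative algorithm; hyperparameters are arbitrary.
model = RandomForestClassifier(n_estimators=100, random_state=0)

# Five-fold cross-validation on the training data, scored by AUC,
# as a check of how consistent performance is across folds.
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")

# Fit on the full training set and evaluate once on the held-out test set.
model.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print("Cross-validated AUC per fold:", cv_auc.round(3))
print("Held-out test AUC:", round(test_auc, 3))
```

A small spread in AUC across the five folds would suggest that performance does not depend heavily on which infants happen to fall in a given fold.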
Results:
The average prevalence of infant RWG was 27%. In the training dataset, all machine learning algorithms showed acceptable to excellent discrimination, with AUCs ranging from 0.75 to 0.85. Accuracy, which indicates the overall correctness of the model, ranged from 0.69 to 0.78. Precision, which measures the model's ability to avoid false positives, ranged from 0.68 to 0.77. Across all models, sensitivity ranged from 0.68 to 0.80 and specificity from 0.65 to 0.78. Random Forest and Gradient Boosting Classifiers showed the most favourable predictive accuracy. Model validation in the test dataset also showed good to excellent predictive performance, with all metrics exceeding 0.74.
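For readers less familiar with these metrics, the sketch below shows how accuracy, precision, sensitivity, and specificity follow from the four cells of a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total,
        # Share of predicted positives that are true positives.
        "precision": tp / (tp + fp),
        # Share of actual positives that the model detects (recall).
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that the model correctly rules out.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts (assumption): 80 infants with RWG, 220 without.
metrics = classification_metrics(tp=60, fp=20, tn=200, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because RWG affects a minority of infants, accuracy alone can look favourable even when few true cases are detected, which is one reason to report sensitivity and specificity alongside it.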
Conclusions:
This study developed the first machine learning-based risk prediction models to identify an infant's risk of experiencing RWG by age one year with good accuracy. The models could feasibly be integrated into routine child growth monitoring and may facilitate population-wide early obesity risk assessment in primary health care.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.