Accepted for/Published in: JMIR Formative Research
Date Submitted: Jun 20, 2022
Date Accepted: Oct 11, 2022
Predicting overweight and obesity status among Malaysian working adults: Comparing performance of machine learning with logistic regression
ABSTRACT
Background:
Overweight or obesity (OW/OB) is a primary health concern that significantly impacts the non-communicable disease burden and threatens national productivity and economic growth. Given the complexity of obesity etiology, machine learning (ML) algorithms offer a promising alternative approach to disentangling interdependent factors for OW/OB prediction.
Objective:
This study examined the performance of three ML algorithms and compared them with logistic regression (LR) to predict OW/OB among working adults in Malaysia.
Methods:
Using data from 16,860 participants (mean age 34.2 ± 9.0 years, 41% male, 41.8% OW/OB) in the Malaysia’s Healthiest Workplace survey by AIA Vitality 2019, predictor variables comprising sociodemographic factors, job characteristics, health and weight perception, and lifestyle-related factors were modeled using the Extreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Machine (SVM) algorithms, as well as LR, to predict OW/OB based on a body mass index cut-off of 25 kg/m².
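As an illustration of how the binary outcome described above could be derived, the following is a minimal sketch that computes body mass index and applies the cut-off of 25. The function names `bmi` and `is_ow_ob` are hypothetical, and the sketch assumes the cut-off is inclusive (a BMI of 25 or more counts as OW/OB), which the abstract does not state explicitly.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_ow_ob(weight_kg: float, height_m: float, cutoff: float = 25.0) -> bool:
    """Label a participant as OW/OB from weight and height.

    Assumption: the cut-off is inclusive (BMI >= cutoff counts as OW/OB).
    """
    return bmi(weight_kg, height_m) >= cutoff


if __name__ == "__main__":
    # Hypothetical participants, not taken from the survey data.
    print(is_ow_ob(80.0, 1.70))
    print(is_ow_ob(60.0, 1.70))
```

The resulting boolean would serve as the target variable for each of the four classifiers.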
Results:
The areas under the receiver operating characteristic curve (AUC) were 0.81 (95% confidence interval [CI] 0.80, 0.82), 0.80 (95% CI 0.79, 0.81), 0.80 (95% CI 0.78, 0.81), and 0.78 (95% CI 0.77, 0.80) for the XGBoost, RF, SVM, and LR models, respectively. Weight satisfaction was the top predictor, and ethnicity, age, and gender were also consistent predictors of OW/OB across all models.
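To make the reported metric concrete, the sketch below computes an AUC as the probability that a randomly chosen OW/OB participant receives a higher predicted score than a randomly chosen non-OW/OB participant, and attaches a percentile bootstrap 95% CI. The abstract does not say how the study's intervals were obtained, so the bootstrap here is an assumption, and `auc` and `bootstrap_auc_ci` are hypothetical helper names.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy labels (1 = OW/OB) and predicted scores, not study data.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(auc(y, s))
    print(bootstrap_auc_ci(y, s, n_boot=200))
```

The pairwise loop is quadratic in sample size and is meant only to show the definition; for a data set of the size used here, a rank-based formulation or a library routine would be the practical choice.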
Conclusions:
Based on multi-domain online workplace survey data, this study produced predictive models that identified OW/OB with moderate-to-high accuracy. The performance of the ML-based and LR models was comparable when predicting obesity among working adults in Malaysia.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.