JMIR Preprints #40404: Predicting overweight and obesity status among Malaysian working adults: Comparing performance of machine learning with logistic regression

Predicting overweight and obesity status among Malaysian working adults: Comparing performance of machine learning with logistic regression

Jyh Eiin Wong;
Miwa Yamaguchi;
Nobuo Nishi;
Michihiro Araki;
Lei Hum Lee

ABSTRACT

Background:

Overweight or obesity (OW/OB) is a primary health concern that significantly impacts the non-communicable disease burden and threatens national productivity and economic growth. Given the complexity of obesity etiology, machine learning (ML) algorithms offer a promising alternative approach to disentangling interdependent factors for OW/OB prediction.

Objective:

This study examined the performance of three ML algorithms and compared them with logistic regression (LR) to predict OW/OB among working adults in Malaysia.

Methods:

Using data from 16,860 participants (mean age 34.2 ± 9.0 years, 41% male, 41.8% OW/OB) in Malaysia’s Healthiest Workplace survey by AIA Vitality 2019, predictor variables comprising sociodemographic factors, job characteristics, health and weight perceptions, and lifestyle-related factors were modeled using the Extreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Machine (SVM) algorithms and LR to predict OW/OB based on a body mass index cut-off of 25.
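The outcome definition above can be sketched in a few lines of Python. This is only an illustration: the abstract states a body mass index cut-off of 25 but not whether it is inclusive, so the sketch assumes OW/OB means BMI of 25 kg/m² or higher. The function names and the sample participants are hypothetical and are not study data.

```python
OW_OB_CUTOFF = 25.0  # BMI cut-off stated in the abstract (kg/m^2)


def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / (height_m ** 2)


def is_ow_ob(weight_kg: float, height_m: float,
             cutoff: float = OW_OB_CUTOFF) -> bool:
    """Binary OW/OB label.

    Assumption: the cut-off is inclusive (BMI >= cutoff); the abstract
    does not say which side of 25 the boundary falls on.
    """
    return bmi(weight_kg, height_m) >= cutoff


# Hypothetical participants as (weight in kg, height in m); not study data.
participants = [(80.0, 1.70), (55.0, 1.65), (70.0, 1.80)]
labels = [int(is_ow_ob(w, h)) for w, h in participants]
print(labels)
```

The resulting 0/1 labels are the kind of binary target that a classifier such as XGBoost, RF, SVM, or LR would be trained to predict from the survey variables.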

Results:

The area under the receiver operating characteristic curve (AUC) values were 0.81 (95% confidence interval [CI] 0.80-0.82), 0.80 (95% CI 0.79-0.81), 0.80 (95% CI 0.78-0.81), and 0.78 (95% CI 0.77-0.80) for the XGBoost, RF, SVM, and LR models, respectively. Weight satisfaction was the top predictor, and ethnicity, age, and gender were also consistent predictors of OW/OB in all models.
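To make the reported metric concrete, the following sketch computes an AUC and a 95% CI using only the Python standard library. The abstract does not state how the intervals were obtained, so the percentile bootstrap used here is an assumption and not the authors' method. The names `auc` and `bootstrap_auc_ci` are hypothetical helpers, and the data are synthetic.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Small worked example: 3 of the 4 positive-negative pairs are ranked correctly.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Synthetic labels and scores; these are not study data.
rng = random.Random(42)
y = [1 if rng.random() < 0.4 else 0 for _ in range(100)]
scores = [0.3 * label + 0.7 * rng.random() for label in y]
point = auc(y, scores)
ci_low, ci_high = bootstrap_auc_ci(y, scores)
print(f"AUC {point:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the values of 0.78 to 0.81 above should be read.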

Conclusions:

Based on multi-domain online workplace survey data, this study produced predictive models that identified OW/OB with moderate-to-high accuracy. The performance of the ML-based and logistic regression models was comparable when predicting OW/OB among working adults in Malaysia.

Citation

Please cite as:

Wong JE, Yamaguchi M, Nishi N, Araki M, Lee LH

Predicting Overweight and Obesity Status Among Malaysian Working Adults With Machine Learning or Logistic Regression: Retrospective Comparison Study

JMIR Form Res 2022;6(12):e40404

DOI: 10.2196/40404

PMID: 36476813

PMCID: 9773027

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Formative Research

Date Submitted: Jun 20, 2022

Date Accepted: Oct 11, 2022
