Accepted for/Published in: JMIR mHealth and uHealth
Date Submitted: Nov 15, 2022
Open Peer Review Period: Nov 15, 2022 - Jan 10, 2023
Date Accepted: Jan 5, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Development and validation of multivariable prediction algorithms to estimate future walking behavior in adults: a retrospective cohort study
ABSTRACT
Background:
Physical inactivity is associated with numerous health risks, including cancer, cardiovascular disease, type 2 diabetes, increased healthcare expenditure, and preventable, premature deaths. The majority of Americans fall short of clinical guideline goals (i.e., 8,000-10,000 steps per day). Behavior prediction algorithms could enable efficacious interventions to promote physical activity by facilitating delivery of nudges at appropriate times.
Objective:
To develop and validate algorithms that predict whether a participant will walk (i.e., for >5 minutes) within the next 3 hours, using the participant's steps-per-minute data from the previous five weeks.
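As an illustration of the prediction target only, the following minimal Python sketch derives a binary label from a steps-per-minute series. The abstract does not define what counts as a walking minute or whether the >5 minutes must be consecutive, so the 60 steps/min cutoff (`WALK_STEPS_PER_MIN`), the use of total rather than consecutive minutes, and the function name are all hypothetical assumptions, not the authors' definition.

```python
from typing import List, Optional

WALK_STEPS_PER_MIN = 60   # hypothetical cutoff for a "walking" minute (not from the paper)
MIN_WALK_MINUTES = 5      # the ">5 minutes" criterion stated in the Objective
HORIZON_MINUTES = 180     # "within the next 3 hours"


def walked_in_next_window(
    steps_per_min: List[Optional[int]], now: int
) -> Optional[bool]:
    """Label for minute index `now`.

    True if the 180 minutes after `now` contain more than 5 walking minutes
    in total (assumed; they need not be consecutive). Returns None when the
    window runs past the end of the series or holds only missing values.
    """
    window = steps_per_min[now + 1 : now + 1 + HORIZON_MINUTES]
    if len(window) < HORIZON_MINUTES:
        return None  # not enough future data to label this time point
    observed = [s for s in window if s is not None]
    if not observed:
        return None  # entirely missing window
    walking_minutes = sum(1 for s in observed if s >= WALK_STEPS_PER_MIN)
    return walking_minutes > MIN_WALK_MINUTES
```

A stricter variant could require consecutive walking minutes; which definition matches the study cannot be determined from the abstract.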
Methods:
We conducted a retrospective, closed-cohort secondary analysis of a 6-week micro-randomized trial (MRT) of the HeartSteps mHealth physical activity intervention conducted in 2015. We evaluated the prediction performance of six algorithms: logistic regression, radial basis function support vector machine, eXtreme Gradient Boosting (XGBoost), multilayer perceptron (MLP), decision tree, and random forest. For the MLP, 90 random layer architectures were tested to optimize performance. Hourly walking data from the prior 5 weeks, including missingness, were used as predictors. The outcome was whether the participant walked during the next 3 hours. K-fold cross-validation (K=10) was used for internal validation. The primary outcome measures were classification accuracy, Matthews correlation coefficient (MCC), sensitivity, and specificity.
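For readers less familiar with the four outcome measures, the sketch below computes them from binary labels using their standard confusion-matrix definitions. It is a minimal illustration, not the authors' evaluation code; the function name `binary_metrics` and the convention of returning 0.0 for an undefined MCC are assumptions.

```python
import math
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and MCC for 0/1 labels (1 = walked)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # MCC is undefined when any marginal count is zero; 0.0 is an assumed convention.
        "mcc": (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0,
    }
```

Under 10-fold cross-validation, such a function would typically be applied to each held-out fold and the per-fold values then summarized.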
Results:
The data set comprised six weeks of data from 44 participants. Most participants were female (70.5%) and White (59.1%); 52.3% had a high-school degree or higher, and 34.1% were married. The mean age was 35.9 years (SD=14.7). Participants who did not have enough data (number of days < 10, n=3) were excluded, resulting in 41 participants. The MLP with optimized layer architecture achieved the highest accuracy (82.0±1.1%). Logistic regression reached 77.2±1.2%, and the remaining algorithms performed below it: XGBoost (76.3±1.5%), random forest (69.5±1.0%), support vector machine (69.3±1.0%), and decision tree (63.6±1.5%). The MLP also outperformed all other algorithms tested in MCC (0.643±0.021), sensitivity (86.1±3.0%), and specificity (77.8±3.3%).
Conclusions:
Walking behavior prediction models were developed and validated. The MLP showed the highest overall performance of all algorithms tested. A random search for the optimal layer structure is a promising approach for prediction engine development. Future studies can test the real-world application of this algorithm in a ‘smart’ intervention for promoting physical activity.
Clinical Trial: N/A
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.