Prediction of chronic stress and protective factors in adults: An interpretable prediction model based on XGBoost and SHAP using national DEGS1 data
ABSTRACT
Background:
Chronic stress is highly prevalent in the German population. It has known adverse effects on mental health, such as burnout and depression, and its known long-term effects include cardiovascular disease, diabetes, and cancer.
Objective:
This study aims to derive a machine learning model for predicting chronic stress levels and identifying protective factors, based on representative national data from the German Health Interview and Examination Survey for Adults (DEGS1), which is part of the national health monitoring program.
Methods:
A dataset from the DEGS1 study including demographic, clinical, and laboratory data from 5,801 participants was analyzed. To compare two machine learning strategies, we trained and validated two classifiers: Random Forest (RF) and eXtreme Gradient Boosting (XGBoost). The performance of the two models was compared using the area under the receiver operating characteristic curve (AUC), precision, recall, and the F1 score. Additionally, SHAP (SHapley Additive exPlanations) was used to interpret the prediction models.
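To illustrate the idea behind the SHAP interpretation step, the following is a minimal sketch of exact Shapley value attribution for a single prediction. It is not the authors' pipeline: the scoring function `toy_score`, its coefficients, and the three binary features are hypothetical, and the brute-force enumeration shown here is a stand-in for the far more efficient tree-specific algorithms that SHAP tooling typically applies to models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features, so this
    is only practical for small toy examples.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the coalition's features take x's values.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(f):
    # Hypothetical score, NOT taken from the study: two additive terms
    # plus one interaction between the first and third feature.
    social_support, good_health, housing_satisfaction = f
    return (0.2 * social_support
            + 0.3 * good_health
            + 0.1 * social_support * housing_satisfaction)


x = [1, 1, 1]         # instance to explain (all hypothetical features present)
baseline = [0, 0, 0]  # reference point (all features absent)
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

In this toy case the interaction term is split evenly between the two features involved, and the attributions sum to the difference between the score at `x` and the score at `baseline`, which is the additivity property that SHAP relies on.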
Results:
Compared to RF, the XGBoost model had a higher macro-average AUC (81%), precision (73%), recall (80%), and F1 score (76%). Important predictor variables for the class of low chronic stress were male gender, very good general health, high satisfaction with living space, and strong social support.
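As a reminder of what macro-averaging means for a multiclass problem, the sketch below computes per-class precision, recall, and F1 in a one-vs-rest fashion and then takes the unweighted mean across classes. It assumes that macro-averaging was applied to these three metrics as well as to the AUC, which the abstract does not state explicitly. The class names and the six example labels are hypothetical and are not DEGS1 data.

```python
def macro_metrics(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 (unweighted mean over classes)."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against division by zero for classes never predicted or never seen.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical stress classes and predictions, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Because every class contributes equally to the mean, a macro-averaged score is not dominated by the most frequent class. Note also that a macro-averaged F1 computed this way is the mean of the per-class F1 values, which need not equal the harmonic mean of the macro-averaged precision and recall.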
Conclusions:
The XGBoost model outperformed the RF model. SHAP identified relevant protective factors against chronic stress, which need to be considered when developing interventions to reduce chronic stress.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.