Accepted for/Published in: JMIR Formative Research
Date Submitted: Oct 18, 2023
Open Peer Review Period: Oct 18, 2023 - Dec 13, 2023
Date Accepted: Oct 5, 2024
Use of Random Forest to predict adherence in an online intervention for depression using baseline and early usage data: Model Development and Validation on retrospective routine care log-data
ABSTRACT
Background:
Online interventions, such as the iFightDepression tool (iFD tool), are increasingly being used as an efficacious alternative to classical face-to-face psychotherapy or pharmacotherapy for treating depression. However, particularly when they are used outside of study settings, low adherence rates and the resulting reduced benefit of the intervention limit their effectiveness. Knowledge of factors predicting adherence would enable early, tailored interventions for those at risk of non-adherence, enhancing user engagement and optimising therapeutic outcomes.
Objective:
To identify users at risk of non-completion, this study aimed to develop and evaluate a random forest model predicting adherence to the iFD tool from characteristics obtained during baseline and week one of the intervention in patients with depression.
Methods:
Log-data of 4187 adult patients who registered for the iFD tool between October 1, 2016 and May 5, 2022 and gave informed consent were statistically analysed. The resulting dataset was partitioned into training (70%) and test (30%) data using a stratified random split. The training dataset was used to train a random forest model to predict the adherence of each user at baseline using the hypothesised predictors (age, self-reported gender, expectations of the intervention, current or preceding depression treatments, confirmed diagnosis of depression, PHQ-9 score at baseline, accompanying guide profession and usage behaviour within the first week). After training, the random forest model was applied to the test dataset to evaluate its predictive performance and to assess the importance of each variable for the prediction of adherence using mean decrease in accuracy, mean decrease in Gini and SHAP values.
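The 70/30 stratified split described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the helper name `stratified_split`, the fixed seed, and the toy labels are all assumptions made for the example, and the abstract does not state which software was used.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) that preserve the class ratio of `labels`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random assignment within each class
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (1 = adherent, 0 = non-adherent); these are NOT study data.
toy_labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(toy_labels)

# The share of adherent users should match in both partitions.
train_rate = sum(toy_labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(toy_labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

Splitting within each class keeps the proportion of adherent users the same in the training and test partitions, which matters when one class is a clear minority.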
Results:
Of all patients evaluated, 1019 (24.3%) were considered adherent based on our predefined definition. An initial random forest model relying solely on sociodemographic and clinical predictors obtained at baseline did not allow for a statistically significant prediction of adherence. After incorporating the first-week usage behaviour of each patient, a significant prediction of adherence was achieved (p < .001). In this prediction, the model achieved an accuracy of 0.82 (95% CI: 0.79-0.84), an F1-score of 0.53, an AUC of 0.83 and a specificity of 0.94 for predicting non-adherent users. The most important predictors of adherence were logs, word count on the first workshop’s worksheet and time spent on the tool, all within the first week.
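For readers less familiar with the reported metrics, the sketch below shows how accuracy, F1-score and specificity follow from a confusion matrix. It is an illustration under stated assumptions: the function name `binary_metrics` is hypothetical, "adherent" is treated as the positive class (the abstract does not specify this), and the toy labels are invented rather than taken from the study. AUC is omitted because it requires predicted probabilities rather than class labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, F1 and specificity for binary labels.

    Hypothetical helper; `positive` marks the class treated as "adherent".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Specificity: share of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {"accuracy": accuracy, "f1": f1, "specificity": specificity}


# Toy example (1 = adherent, 0 = non-adherent); these are NOT study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

When only about a quarter of users are adherent, accuracy can be dominated by the majority class, which is one reason to read it alongside the F1-score and specificity.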
Conclusions:
Our results highlight that early engagement, particularly the usage behaviour in the first week of the online intervention, is a far stronger predictor of adherence than any sociodemographic or clinical factor. Examining usage behaviour within the first week and identifying likely non-adherers through the algorithm could therefore help to tailor interventions that improve user adherence in a targeted manner, for example through follow-up calls or face-to-face discussions, while making optimal use of resources.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.