Currently submitted to: Journal of Medical Internet Research
Date Submitted: May 3, 2026
Open Peer Review Period: May 3, 2026 - Jun 28, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Prediction of Clinically Meaningful Improvement after Internet-Delivered Cognitive Behavioral Therapy for Depression and Anxiety Disorders: Machine Learning–Based Predictive Model Development and Temporal Validation Study
ABSTRACT
Background:
Up to 50% of patients treated with internet-delivered cognitive behavioral therapy (ICBT) for depression and anxiety disorders do not experience clinically significant symptom reduction. Identifying these patients before ICBT initiation could help optimize treatment outcomes.
Objective:
The aim of this study was to enhance baseline prediction of clinically meaningful improvement in patients treated with ICBT for common psychiatric disorders in routine care, which could ultimately inform treatment allocation at intake.
Methods:
We developed multimodal predictive models integrating clinical, sociodemographic, and genetic data to predict clinically meaningful improvement in a sample of 1790 patients treated with ICBT for major depressive disorder, panic disorder, or social anxiety disorder. Only data available before treatment were used, to enable baseline prediction. We applied machine learning algorithms of varying complexity (logistic regression, random forest, XGBoost, support vector machines, and soft voting and stacking ensembles), with nested cross-validation, elastic net variable selection, multiple imputation, and temporal validation in a 20% holdout test set (n=356). The primary performance measure was the area under the receiver operating characteristic curve (AUC).
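As an illustration of the temporal validation step, the following is a minimal sketch of a chronological holdout split, assuming each patient record carries a treatment start date. The function name `temporal_split`, the `start_date` field, and the toy records are hypothetical and are not taken from the study; the abstract does not describe how the cutoff was chosen.

```python
from datetime import date, timedelta


def temporal_split(records, date_key="start_date", test_fraction=0.2):
    """Split records chronologically.

    The earliest records form the development set; the most recent
    `test_fraction` of records form the holdout test set.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Order patients by the (assumed) treatment start date.
    ordered = sorted(records, key=lambda r: r[date_key])
    n_test = round(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Toy data (hypothetical): ten patients, deliberately listed newest first.
patients = [
    {"id": i, "start_date": date(2020, 1, 1) + timedelta(days=7 * i)}
    for i in reversed(range(10))
]

development, holdout = temporal_split(patients, test_fraction=0.2)
print([p["id"] for p in holdout])
```

Unlike a random split, every holdout patient here started treatment after every development patient, so model tuning (for example, nested cross-validation) can be confined to the development set while the holdout approximates performance on future patients.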
Results:
All models showed comparable performance, with random forest achieving the best discrimination (AUCtest 0.749, 95% CI [0.698, 0.797]). Models that included data from national registers outperformed a benchmark model based on self-reported screening data (AUCtest 0.732–0.749 vs 0.695), whereas polygenic scores added no independent predictive value (DeLong test P=.966).
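To make the reported metric concrete, the sketch below computes an AUC and a percentile bootstrap confidence interval using only the Python standard library. The abstract does not state how the 95% CIs were obtained, so the percentile bootstrap is an assumption chosen for illustration; the function names and the toy labels and scores are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int(round((alpha / 2) * n_boot)))
    hi_idx = min(n_boot - 1, int(round((1 - alpha / 2) * n_boot)) - 1)
    return stats[lo_idx], stats[hi_idx]


# Toy data (hypothetical): 1 = clinically meaningful improvement.
y = [0, 0, 1, 1, 0, 1, 0, 1]
predicted = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.50, 0.90]

point = auc(y, predicted)  # 14 of 16 positive-negative pairs ranked correctly
ci_low, ci_high = bootstrap_auc_ci(y, predicted)
print(point, ci_low, ci_high)
```

Comparing two correlated AUCs on the same test set, as in the DeLong test mentioned above, requires a paired procedure and is not covered by this sketch.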
Conclusions:
These promising results provide a foundation for a future prospective trial to ascertain whether baseline prediction can effectively guide tailored interventions for at-risk patients.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.