Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Oct 7, 2024
Open Peer Review Period: Oct 7, 2024 - Dec 2, 2024
Date Accepted: Mar 24, 2025
Population-wide Depression Incidence Forecasting Comparing ARIMA/vector-ARIMA to Temporal Fusion Transformers
ABSTRACT
Background:
Accurate prediction of population-wide depression incidence is vital for effective public mental health management. However, this incidence is often influenced by socio-economic factors, such as population shocks, creating complex structural break scenarios in the time series data. These structural breaks can affect the performance of forecasting methods in various ways. Therefore, understanding and comparing different models across these scenarios is essential.
Objective:
To develop depression incidence forecasting models and compare the performance of ARIMA/vector-ARIMA (VARIMA) and Temporal Fusion Transformers (TFT) under different structural break scenarios.
Methods:
We developed population-wide depression incidence forecasting models and compared the performance of ARIMA/VARIMA-based methods with that of TFT-based methods. Using monthly depression incidence in Hong Kong from 2002 to 2022, we applied sliding windows to segment the full time series into 72 ten-year sub-samples. The forecasting models were trained, validated, and tested on each sub-sample: the first seven years were used for training, the eighth year for hold-out validation, and the ninth and tenth years for testing. Accuracy on the test set of each sub-sample was measured by the symmetric mean absolute percentage error (SMAPE).
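The windowing scheme and accuracy metric described above could be sketched as follows. This is a minimal illustration, not the authors' code: the function names `sliding_splits` and `smape`, the `step` parameter, and the exact SMAPE variant (mean of 2|F - A| / (|A| + |F|), reported as a fraction) are assumptions, since the abstract does not state the window stride or the formula used.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_splits(
    series: Sequence[float],
    window: int = 120,   # ten years of monthly observations
    n_train: int = 84,   # first seven years
    n_val: int = 12,     # eighth year
    step: int = 1,       # ASSUMPTION: stride between windows is not given in the abstract
) -> Iterator[Tuple[List[float], List[float], List[float]]]:
    """Yield (train, validation, test) slices for each ten-year sub-sample."""
    if n_train + n_val >= window:
        raise ValueError("window must leave room for a test segment")
    for start in range(0, len(series) - window + 1, step):
        chunk = list(series[start:start + window])
        train = chunk[:n_train]
        val = chunk[n_train:n_train + n_val]
        test = chunk[n_train + n_val:]   # remaining months (years nine and ten)
        yield train, val, test


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE as a fraction in [0, 2] (ASSUMED variant)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Treat a 0/0 term as zero error to avoid division by zero.
        total += 0.0 if denom == 0 else 2.0 * abs(f - a) / denom
    return total / len(actual)


if __name__ == "__main__":
    # Synthetic monthly series for illustration only; not the study data.
    series = [100.0 + (i % 12) for i in range(252)]
    for train, val, test in sliding_splits(series, step=12):
        # Hypothetical baseline: repeat the last training value.
        naive = [train[-1]] * len(test)
        print(round(smape(test, naive), 4))
```

Any forecasting model (ARIMA, VARIMA, or TFT) would take the place of the naive baseline in the loop, with the validation slice used for model selection before scoring on the test slice.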
Results:
We found that in sub-samples without a significant slope or trend change (structural break), multivariate TFT significantly outperformed univariate TFT, VARIMA, and ARIMA, with an average SMAPE of 0.116 compared with 0.132 (P = 0.011) for univariate TFT, 0.164 (P = 0.002) for VARIMA, and 0.148 (P = 0.003) for ARIMA. Adjusting for the unemployment rate improved TFT performance more than it improved VARIMA performance. When fluctuating outbreaks occurred, TFT was more robust to sharp interruptions, whereas VARIMA/ARIMA performed better when incidence surged and remained high.
Conclusions:
This study represents an effort to compare the forecasting performance of TFT and ARIMA methods for disease incidence. The findings provide guidance on model selection for predicting disease burden under various scenarios, including stable periods and major events with uncertain impacts on epidemiology, such as pandemics and socio-political interruptions.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.