Accepted for/Published in: JMIR Formative Research
Date Submitted: Oct 14, 2025
Date Accepted: Jan 9, 2026
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Predictive Modeling of Enterovirus Hospital Burden Using Machine Learning and Age-Specific Surveillance Data: A Multi-Setting Analysis from Taiwan (2008-2024)
ABSTRACT
Background:
Enterovirus infections, particularly hand, foot, and mouth disease (HFMD), pose a significant public health burden in Taiwan, with seasonal hospitalization peaks in spring and early summer. Accurate forecasting of hospital burden is essential for effective resource allocation and public health preparedness. In this study, 1-week-ahead forecasts achieved a mean absolute percentage error of 6.8%.
Objective:
To develop and evaluate machine learning (ML) models using age-specific surveillance data to predict enterovirus-related hospitalizations 1-4 weeks in advance, informing public health decision-making.
Methods:
We analyzed weekly surveillance data (2008-2024) from Taiwan's Centers for Disease Control, including outpatient, emergency, and hospitalization counts across five age groups. Support Vector Machine (SVM), XGBoost, and Random Forest (RF) models were trained on 85% of the data (2008-2021) and tested on the remaining 15% (2022-2024). Performance was assessed using R-squared (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE) for 1-, 2-, and 4-week-ahead predictions. Age-specific features and risk ratios were also evaluated.
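As an illustration of the evaluation setup described above, the following is a minimal sketch, not the authors' implementation. It assumes weekly series stored as plain Python lists; the helper names (`make_horizon_pairs`, `chronological_split`) are hypothetical, and skipping zero-count weeks in MAPE is an assumption made here to avoid division by zero.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the counts."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Weeks with a true count of zero are skipped (an assumption of this
    sketch), because the percentage error is undefined there.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

def make_horizon_pairs(features, target, horizon):
    """Pair the features of week t with the target of week t + horizon."""
    return features[:len(features) - horizon], target[horizon:]

def chronological_split(rows, train_fraction=0.85):
    """Split in time order so the test period follows the training period."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical weekly hospitalization counts and model predictions.
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
print(r_squared(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

Splitting chronologically rather than at random keeps future weeks out of the training set, which matters when the features are lagged versions of the same surveillance series.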
Results:
Random Forest outperformed SVM and XGBoost, achieving R²=0.951, RMSE=12.8, and MAPE=6.8% for 1-week-ahead predictions with age-specific features. Performance declined at longer horizons (2-week: R²=0.893, MAPE=11.4%; 4-week: R²=0.742, MAPE=22.8%). Outpatient visits among children aged 0-2 and 5-9 years were the strongest predictors. Children aged 0-2 years had 37.1× higher hospitalization odds (95% CI: 35.2-39.1) than adults (aged 15+ years).
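The abstract does not state how the odds ratio and its confidence interval were computed. One common approach, shown here as a sketch under that assumption, is the odds ratio from a 2×2 table with a Wald interval on the log scale. The function name and the counts in the usage line are hypothetical and are not data from this study.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: events in the comparison group (e.g., hospitalized, aged 0-2)
    b: non-events in the comparison group
    c: events in the reference group (e.g., hospitalized, aged 15+)
    d: non-events in the reference group
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
print(odds_ratio_wald_ci(20, 80, 10, 90))
```

With large surveillance counts, the standard error shrinks and the interval becomes narrow relative to the estimate, as in the interval reported above.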
Conclusions:
ML models using age-specific surveillance data enable accurate 1- to 2-week-ahead forecasting of enterovirus-related hospitalizations. This approach supports proactive resource allocation and can be adapted for other infectious diseases, advancing digital epidemiology.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.