Accepted for/Published in: JMIR Public Health and Surveillance

Date Submitted: Oct 25, 2025
Date Accepted: May 30, 2026

The final, peer-reviewed published version of this preprint can be found here:

Predicting Tuberculosis Outcomes Using Routine Surveillance Data in Chiang Mai, Thailand: Retrospective Cohort Study

Saksaen P, Boonchieng E, Thongprachum A, Maotheuak S, Chautrakarn S, Boonchieng W

JMIR Public Health Surveill 2026;12:e86495

DOI: 10.2196/86495

PMID: 42348883

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Integrating SEIR Modeling and Machine Learning to Enhance Tuberculosis Surveillance: A Retrospective Cohort Study in Chiang Mai, Thailand

  • Porramat Saksaen
  • Ekkarat Boonchieng
  • Aksara Thongprachum
  • Surasak Maotheuak
  • Sineenart Chautrakarn
  • Waraporn Boonchieng

ABSTRACT

Background:

Tuberculosis (TB) is the leading cause of death from infectious disease worldwide. In 2023, approximately 10.8 million new TB cases occurred, and 8.2 million cases were recorded regionally. Thailand is a high-burden country for both TB and TB/HIV, with an estimated 113,000 new TB cases each year and a treatment coverage rate of only 71% [1]. In Chiang Mai Province, wide disparities in early detection persist, especially in rural and remote districts, which calls for innovative surveillance models that address equity concerns.

Objective:

This study aimed to develop and test a hybrid surveillance framework that combines the SEIR (Susceptible-Exposed-Infectious-Recovered) epidemiological model with machine learning (ML) algorithms to improve TB risk forecasting and prevention and to support informed decision-making in Chiang Mai, Thailand.
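For readers unfamiliar with the SEIR structure named in the objective, the following is a minimal illustrative sketch of a closed-population SEIR model integrated with forward Euler steps. It is not the authors' implementation: the function name `simulate_seir`, the rate parameters `beta`, `sigma`, and `gamma`, and all numeric values are assumptions chosen only for illustration and are not calibrated to TB or to the Chiang Mai data.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    """Forward-Euler integration of a closed-population SEIR model.

    beta  : transmission rate per day (hypothetical value supplied by caller)
    sigma : rate of leaving the Exposed compartment, 1 / latent period
    gamma : rate of leaving the Infectious compartment, 1 / infectious period
    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    n = s0 + e0 + i0 + r0  # total population, constant in this sketch
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt       # E -> I
            new_recovered = gamma * i * dt        # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


# Illustrative run with made-up parameters (not estimates from the study).
trajectory = simulate_seir(s0=990, e0=5, i0=5, r0=0,
                           beta=0.3, sigma=0.1, gamma=0.05, days=60)
final_s, final_e, final_i, final_r = trajectory[-1]
print(f"Day 60: S={final_s:.1f} E={final_e:.1f} I={final_i:.1f} R={final_r:.1f}")
```

In a hybrid design such as the one described here, compartment-level output of this kind would presumably supply the population-scale view, while the ML models supply individual-level risk estimates.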

Methods:

We conducted a retrospective cohort study with data mining of 5,557 TB cases registered in the National Tuberculosis Information Program (NTIP) from 2020 to 2024. We built a hybrid SEIR-ML model that matched an algorithm to each disease stage: logistic regression for infection risk, random forest for disease progression, a Cox proportional hazards model for mortality, and an accelerated failure time model for treatment delay. Model performance was measured with the area under the receiver operating characteristic curve (AUC), Harrell's C-index, and R². Scenario simulations were conducted to estimate the potential impact of implementing the model on the health system and to identify possible implementation problems.
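To make the survival metric named in the Methods concrete, the sketch below computes Harrell's C-index by direct pair counting. It is an illustration under stated assumptions, not the authors' code: the function name `harrell_c_index` and the toy data are hypothetical, and ties in follow-up time are simply treated as non-comparable. The AUC reported for the classification models is the analogous pairwise concordance for a binary outcome.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index by exhaustive pair counting.

    times       : follow-up time for each subject
    events      : 1 if the event (e.g., death) was observed, 0 if censored
    risk_scores : model output where a higher score means higher predicted risk
    A pair (a, b) is comparable when subject a had an observed event and a
    strictly shorter follow-up time than subject b.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for a in range(n):
        if not events[a]:
            continue  # a censored subject cannot anchor a comparable pair
        for b in range(n):
            if times[a] < times[b]:
                comparable += 1
                if risk_scores[a] > risk_scores[b]:
                    concordant += 1.0   # earlier event had the higher risk score
                elif risk_scores[a] == risk_scores[b]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.5, 0.1]
print(harrell_c_index(times, events, risk_scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against observed survival times.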

Results:

The integrated models showed high predictive performance across all dimensions: AUC 0.89 (95% CI: 0.87-0.91) for infection and 0.91 (95% CI: 0.89-0.93) for progression, C-index 0.86 (95% CI: 0.84-0.88) for mortality, and R² = 0.74 for treatment delay (all p < .001). HIV co-infection (HR = 5.8) and HIV co-infection combined with age above 65 years (HR = 12.3) were strongly associated with mortality risk. Rural residence, older age, and health insurance were significantly associated with treatment delays (mean treatment delay: 12-18 days). Scenario modeling projected that implementing the proposed framework would increase early detection by 25%, improve treatment outcomes by 15%, and reduce mortality by 20%.

Conclusions:

Combining mechanistic SEIR modeling with machine learning-based risk prediction enhanced TB surveillance by enabling forecasting at both the population and the individual level. The framework also detected structural inequities in healthcare access and could serve as a scalable, equity-focused decision-support tool for TB control in resource-limited settings.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.