Accepted for/Published in: JMIR Bioinformatics and Biotechnology
Date Submitted: Oct 14, 2025
Date Accepted: Mar 13, 2026
Temporal Reproducibility of a Genetic Algorithm–Derived Health Risk Score: A Standardized Out-of-Fold Validation Framework (2021–2023)
ABSTRACT
Background:
Population aging and an increasingly diverse workforce are raising the need for corporate health management and preventive strategies. Conventional threshold-based evaluations apply the same cutoffs to everyone, which limits their personalization and predictive value. To address this gap, Genetic Algorithm (GA)-based scoring has been proposed as a data-driven approach for health risk stratification [1,2], while Bayesian estimation (Bayes) provides a probabilistic framework for integrating lifestyle behaviors with clinical indicators. Concerns about reproducibility persist, however, for real-world health checkup data, where model stability and generalization remain underexplored. This underscores the need for standardized evaluation across multiple cohorts.
Objective:
The objective of this study was to validate the reproducibility, stability, and generalizability of a GA-Bayes integrated health risk score using health checkup data from three consecutive years (2021–2023) under harmonized methodological settings.
Methods:
Anonymized health checkup data from 2021 (n=3,744), 2022 (n=5,153), and 2023 (n=5,352) were analyzed. The study incorporated a total of thirteen clinical indicators, including body mass index (BMI), blood pressure, triglycerides, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT, γ-GTP), fasting plasma glucose (FPG), hemoglobin A1c (HbA1c), and other associated metabolic markers. In addition, ten lifestyle questionnaire items were included, such as smoking, alcohol consumption, diet, and exercise. Missing values in continuous variables were imputed with the median. GA optimization was implemented with seed control, stratified repeated cross-validation, Platt scaling [5], and early stopping. Logistic regression [3] was then applied to the GA-optimized scores to generate receiver operating characteristic (ROC) curves and to calculate cross-validated area under the curve (AUC) values, complemented by Bayesian estimation.
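As a rough illustration of the evaluation steps named above (median imputation, seeded stratified repeated cross-validation, a sigmoid fit on a one-dimensional score, and out-of-fold AUC), the following minimal Python sketch uses only the standard library. It is not the authors' implementation: the GA search, early stopping, and Bayesian estimation are omitted, the cohort is synthetic, and every function name and parameter value (impute_median, stratified_folds, fit_logistic_1d, out_of_fold_auc, k=5, repeats=3) is an assumption made for illustration. Fitting a logistic curve to a single score is used here as a stand-in for both the logistic regression and the Platt scaling steps.

```python
import math
import random
import statistics


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in column]


def stratified_folds(labels, k, seed):
    """Assign each index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)  # seed control makes the split repeatable
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k
    return fold_of


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(x, y, epochs=300, lr=0.1):
    """Fit p = sigmoid(a*x + b) by batch gradient descent on the log loss."""
    a, b, n = 0.0, 0.0, len(x)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(a * xi + b) - yi
            grad_a += err * xi
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney statistic); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def out_of_fold_auc(score, labels, k=5, repeats=3, seed=0):
    """Mean AUC of out-of-fold predictions over repeated stratified k-fold CV."""
    n, aucs = len(labels), []
    for r in range(repeats):
        fold_of = stratified_folds(labels, k, seed + r)
        oof = [0.0] * n
        for f in range(k):
            train = [i for i in range(n) if fold_of[i] != f]
            a, b = fit_logistic_1d([score[i] for i in train],
                                   [labels[i] for i in train])
            # Each sample is predicted only by a model that never saw it.
            for i in range(n):
                if fold_of[i] == f:
                    oof[i] = sigmoid(a * score[i] + b)
        aucs.append(auc(oof, labels))
    return sum(aucs) / len(aucs)


# Synthetic stand-in for one yearly cohort (not real checkup data).
rng = random.Random(42)
labels = [1 if i % 4 == 0 else 0 for i in range(200)]
raw_score = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]
for i in range(0, 200, 17):
    raw_score[i] = None  # simulate missing measurements
score = impute_median(raw_score)
oof_auc = out_of_fold_auc(score, labels)
print(f"out-of-fold AUC: {oof_auc:.3f}")
```

Running the same function with the same settings on each yearly cohort and comparing the resulting AUC values is one simple way to express the kind of harmonized, cross-year comparison the study describes.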
Results:
The model demonstrated stable discriminative performance across the three yearly cohorts, with both full-sample and cross-validated AUC values consistently exceeding 0.90 (0.925, 0.922, and 0.915 for 2021, 2022, and 2023, respectively). Consistent AUCs across cohorts confirmed the reproducibility of model discrimination under standardized evaluation procedures, supporting the methodological rigor of the GA-Bayes framework.
Conclusions:
The GA-Bayes integrated score demonstrated robust reproducibility, stability, and generalizability under standardized analytical procedures across three consecutive health checkup cohorts (2021–2023). Integrating Bayesian estimation with GA optimization enhances the methodological rigor and clinical applicability of the framework. These findings support the feasibility of deploying GA-Bayes-based risk scoring in preventive medicine workflows and highlight the importance of standardized evaluation protocols for reliable implementation and broader practical adoption.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.