Accepted for/Published in: JMIR AI
Date Submitted: Nov 13, 2025
Open Peer Review Period: Dec 2, 2025 - Jan 27, 2026
Date Accepted: Apr 8, 2026
Subject-Aware Model Validation for Repeated-Measures Data: A Nested Approach for Trustworthy Medical AI Applications
ABSTRACT
Background:
Repeated-measures datasets are common in biomechanics and digital health, where each participant contributes multiple correlated trials. If cross-validation (CV) ignores this structure, information can leak from training to test folds, inflating performance and undermining clinical credibility.
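The leakage mechanism can be seen directly in how folds are built: a standard K-fold split scatters a participant's correlated trials across training and test folds, while a group-aware split keeps them together. A minimal sketch on synthetic data (participant counts and feature dimensions here are illustrative, not the study's):

```python
import numpy as np
from sklearn.model_selection import KFold, GroupKFold

rng = np.random.default_rng(0)
n_subjects, trials_per_subject = 10, 5
groups = np.repeat(np.arange(n_subjects), trials_per_subject)  # subject ID per trial
X = rng.normal(size=(groups.size, 3))  # placeholder features

def folds_with_subject_overlap(cv):
    """Count folds where at least one participant appears in both train and test."""
    return sum(
        bool(set(groups[tr]) & set(groups[te]))
        for tr, te in cv.split(X, groups=groups)
    )

# Standard K-fold leaks subjects across the split; GroupKFold does not.
print(folds_with_subject_overlap(KFold(n_splits=5, shuffle=True, random_state=0)))
print(folds_with_subject_overlap(GroupKFold(n_splits=5)))  # 0
```

With grouped splitting the overlap count is zero by construction, which is the property the subject-aware strategies below rely on.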
Objective:
To evaluate the impact of subject-aware validation strategies on model reliability in repeated-measures classification tasks, using fear of re-injury prediction post–anterior cruciate ligament reconstruction (ACLR) as a case study.
Methods:
We analyzed 623 hop trials from 72 individuals post-ACLR to classify fear of re-injury based on biomechanical features. Four CV strategies were compared: stratified 10-fold CV, Leave-One-Participant-Out CV (LOPOCV), Group 3-fold CV, and a nested framework combining LOPOCV (outer loop) with Group 3-fold CV (inner loop). Ten supervised classifiers were benchmarked across classification accuracy, train–test generalization gap, model ranking consistency, and computational efficiency.
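The nested framework described above can be sketched as follows: a Leave-One-Participant-Out outer loop estimates generalization to unseen subjects, while a Group 3-fold inner loop selects hyperparameters without subject overlap. The data, feature names, and parameter grid here are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, GroupKFold, LeaveOneGroupOut

rng = np.random.default_rng(42)
n_subjects, trials = 12, 6
groups = np.repeat(np.arange(n_subjects), trials)      # subject ID per trial
X = rng.normal(size=(groups.size, 4))                  # placeholder biomechanical features
y = np.repeat(np.arange(n_subjects) % 2, trials)       # subject-level label (balanced)

outer = LeaveOneGroupOut()  # each outer test fold is one held-out participant
scores = []
for train_idx, test_idx in outer.split(X, y, groups):
    # Inner loop: participant-aware 3-fold CV for hyperparameter selection
    search = GridSearchCV(
        ExtraTreesClassifier(random_state=0),
        param_grid={"n_estimators": [25, 50]},  # illustrative grid
        cv=GroupKFold(n_splits=3),
    )
    # Pass groups so the inner split is also subject-aware
    search.fit(X[train_idx], y[train_idx], groups=groups[train_idx])
    scores.append(accuracy_score(y[test_idx], search.predict(X[test_idx])))

print(f"Nested LOPOCV accuracy: {np.mean(scores):.2f}")
```

Because every evaluation score comes from a participant the tuned model never saw, the resulting estimate reflects subject-level generalization rather than within-subject memorization.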
Results:
Stratified 10-fold CV systematically overestimated model performance (e.g., Extra Trees accuracy of 0.91 vs. 0.66 under LOPOCV) due to subject-level data leakage. Group and nested CV strategies yielded more conservative and stable estimates. The nested LOPOCV + Group CV framework achieved a good balance between generalization and participant-level independence, with reduced bias and overfitting compared to non-nested alternatives.
Conclusions:
Subject-aware validation strategies are essential for trustworthy ML evaluation in repeated-measures settings. Nested CV designs improve reproducibility, reduce selection bias, and align with regulatory expectations for clinical ML tools. These findings support best practices in model validation for biomechanics and digital health applications.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.