Currently submitted to: JMIR Rehabilitation and Assistive Technologies
Date Submitted: May 26, 2026
Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Validation of Single-Camera MediaPipe BlazePose for Knee Joint Angle Measurement: Concurrent Validity, Inter-Rater Reliability, and Exploratory Test-Retest Reliability Across Multiple Functional Tasks
ABSTRACT
Background:
Markerless pose estimation offers a cost-free approach to remote knee joint assessment, but validation that covers concurrent validity together with multiple forms of reliability is lacking.
Objective:
To evaluate the concurrent validity, inter-rater reliability, and test-retest reliability of the MediaPipe BlazePose Heavy model for knee joint angle measurement across three functional tasks (Squat, Sit-to-Stand, and bilateral Single-Leg Stance, the latter performed on each leg).
Methods:
Fifteen healthy adults (8 male, 7 female; age 20.8 ± 0.6 years; BMI 23.9 ± 3.5 kg/m²) performed Squat, Sit-to-Stand, and bilateral Single-Leg Stance tasks while filmed from a lateral view. Static knee flexion was measured at four goniometer-set angles (30 degrees, 60 degrees, 90 degrees, 120 degrees). BlazePose task-specific knee angles were compared with Kinovea 2D video annotation for dynamic tasks and with goniometry for static positions using intraclass correlation coefficients (ICC[2,1]), Bland-Altman analysis, and RMSE. Four participants returned for a retest session.
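The abstract does not state how knee angles were derived from the pose landmarks. As a minimal sketch, assuming the common three-point approach on lateral-view 2D coordinates (hip, knee, and ankle, all of which BlazePose provides as landmarks), flexion can be taken as the deviation of the thigh-shank angle from a straight leg. The function name `knee_flexion_deg` and the convention that 0 degrees means full extension are illustrative assumptions, not the authors' documented method.

```python
import math


def knee_flexion_deg(hip, knee, ankle):
    """Knee flexion in degrees from 2D (x, y) points.

    Assumed convention (not taken from the paper): 0 = fully extended leg,
    larger values = deeper flexion.
    """
    # Segment vectors pointing away from the knee: thigh and shank.
    tx, ty = hip[0] - knee[0], hip[1] - knee[1]
    sx, sy = ankle[0] - knee[0], ankle[1] - knee[1]
    if (tx == 0 and ty == 0) or (sx == 0 and sy == 0):
        raise ValueError("hip/knee or knee/ankle coincide; angle is undefined")
    # Included angle between the segments; atan2 of |cross| and dot is
    # numerically steadier than acos near 0 and 180 degrees.
    cross = tx * sy - ty * sx
    dot = tx * sx + ty * sy
    included = math.degrees(math.atan2(abs(cross), dot))
    # Flexion is how far the joint departs from a straight (180 degree) leg.
    return 180.0 - included
```

Because the computation uses only the image-plane projection, any out-of-plane rotation of the leg relative to the camera changes the measured angle, which is one plausible contributor to angle-dependent error in single-camera setups.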
Results:
Inter-rater Kinovea reliability was excellent (ICC = 0.999 [0.998, 0.999]). Dynamic concurrent validity was excellent overall (ICC = 0.982 [0.95, 0.99], MAE = 7.07 degrees) but with a systematic bias of -5.12 degrees (underestimation); task-stratified ICCs were 0.773 [0.23, 0.91] (Squat) and 0.632 [-0.08, 0.87] (Sit-to-Stand). Static accuracy was angle-dependent (RMSE = 15.49 degrees), with overestimation at shallow flexion and underestimation at deep flexion. Single-leg stance showed poor validity (ICC = 0.136 [-0.15, 0.40]), with task-intrinsic instability supported by negative test-retest ICCs for the Kinovea measurements themselves. In an exploratory test-retest sub-sample (n = 4 participants), the BlazePose-vs-Kinovea bias was not stable across sessions (ICC = 0.094 [-0.22, 0.40]), suggesting that fixed-offset calibration should not be assumed and requires confirmation in a larger cohort.
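The agreement statistics reported above (bias, MAE, RMSE, and ICC[2,1]) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' analysis code: the function names are hypothetical, the limits of agreement use the conventional bias ± 1.96 SD, and ICC(2,1) follows the two-way random-effects, absolute-agreement, single-measurement form of Shrout and Fleiss. Confidence intervals are not computed here.

```python
import math
from statistics import mean, stdev


def bland_altman(test, ref):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [t - r for t, r in zip(test, ref)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def rmse(test, ref):
    """Root-mean-square error between paired measurements."""
    return math.sqrt(mean([(t - r) ** 2 for t, r in zip(test, ref)]))


def mae(test, ref):
    """Mean absolute error between paired measurements."""
    return mean([abs(t - r) for t, r in zip(test, ref)])


def icc_2_1(rows):
    """ICC(2,1) from rows of per-subject values, one column per method/rater."""
    n, k = len(rows), len(rows[0])
    grand = mean([v for row in rows for v in row])
    row_means = [mean(row) for row in rows]
    col_means = [mean([row[j] for row in rows]) for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, methods, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    # Absolute agreement: between-method variance counts against the ICC.
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because ICC(2,1) measures absolute agreement, a constant offset between two methods lowers it even when the methods are perfectly correlated, which is why the bias is worth reporting alongside the ICC.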
Conclusions:
BlazePose shows promising concurrent validity for dynamic flexion-focused tasks but exhibits clearly characterized failure modes (angle-dependent static bias, single-leg stance instability, and session-to-session bias drift) that define the boundaries of its applicability in telerehabilitation. These preliminary boundaries may inform future protocol design and screening-level telerehabilitation use, pending larger-sample and clinical-population validation.
Clinical Trial:
Not applicable.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.