Currently submitted to: JMIR Medical Informatics

Date Submitted: Mar 24, 2026
Open Peer Review Period: Apr 13, 2026 - Jun 8, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Development and validation of machine-learning models for sarcopenia risk prediction in older adults based on evidence-driven variable selection

  • Yuyue Wu
  • Xinyu Ma
  • Jianduan Mei
  • Mingxuan Liu
  • Yan Song

ABSTRACT

Background:

Sarcopenia is common in older adults and has serious adverse consequences, yet existing risk prediction tools often lack systematic variable screening and have limited generalizability. A more reliable risk prediction model is therefore needed.

Objective:

To develop and validate sarcopenia risk prediction models in older adults by integrating evidence-driven variable selection with machine learning for early screening and risk stratification.

Methods:

Candidate risk factors identified through a systematic meta-analysis were extracted from the China Health and Retirement Longitudinal Study. Participants (N=2530; prevalence 15.5%) were divided into training and test sets in a 7:3 ratio. Predictors were selected with least absolute shrinkage and selection operator (LASSO) regression, and 10 machine learning models were trained. Model performance was evaluated using cross-validation, area under the curve (AUC), Brier score, calibration, and decision curve analysis. External validation used an independent cohort (n=191; incidence rate 16.2%). SHapley Additive exPlanations (SHAP) analysis was used to quantify the contribution of each variable.
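As an illustration of two of the evaluation steps named above, the following minimal Python sketch shows a stratified 7:3 split and the AUC and Brier score computed from predicted probabilities. It is not the authors' code: the function names, the random seed, and the toy labels and probabilities are assumptions made for this example, and no study data are used.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/test, keeping the class ratio in each part."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def auc(y_true, y_prob):
    """AUC: chance that a random case scores above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data (hypothetical): 31 cases among 200 people, i.e. 15.5% prevalence.
toy_labels = [1] * 31 + [0] * 169
train_idx, test_idx = stratified_split(toy_labels)

# Hypothetical predicted probabilities for four people.
toy_y = [0, 0, 1, 1]
toy_p = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_y, toy_p)
toy_brier = brier_score(toy_y, toy_p)
```

A higher AUC indicates better discrimination, while a lower Brier score indicates predicted probabilities that sit closer to the observed outcomes.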

Results:

Elastic net, logistic regression, and ridge regression all showed strong discrimination in the test set, with no significant differences observed among them. Baseline calibration error was reduced through model adjustment. External validation showed stable model performance and positive net benefit across different thresholds. SHAP analysis indicated that age and body mass index were the most influential factors, while weakness, cognitive function, and depressive symptoms also contributed independently.
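For readers unfamiliar with the net benefit reported in decision curve analysis, the sketch below computes it at a single threshold probability using the standard definition (true positives per person minus false positives per person weighted by the odds of the threshold). The function names and the ten toy observations are assumptions for illustration only and are unrelated to the study cohorts.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging everyone with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every person."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): 2 cases among 10 people.
toy_y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_p = [0.9, 0.6, 0.7, 0.2, 0.1, 0.1, 0.3, 0.05, 0.15, 0.25]

nb_model = net_benefit(toy_y, toy_p, threshold=0.5)
nb_all = net_benefit_treat_all(toy_y, threshold=0.5)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); a decision curve repeats this comparison over a range of thresholds.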

Conclusions:

Elastic net, logistic regression, and ridge regression showed strong discrimination, calibration, and clinical utility, supporting noninvasive, cost-effective early sarcopenia detection and risk stratification.

Clinical Trial:

PROSPERO (CRD420251083240), https://www.crd.york.ac.uk/prospero/


 Citation

Please cite as:

Wu Y, Ma X, Mei J, Liu M, Song Y

Development and validation of machine-learning models for sarcopenia risk prediction in older adults based on evidence-driven variable selection

JMIR Preprints. 24/03/2026:95975

DOI: 10.2196/preprints.95975

URL: https://preprints.jmir.org/preprint/95975

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.