Currently submitted to: JMIR Pediatrics and Parenting

Date Submitted: Apr 12, 2026
Open Peer Review Period: May 1, 2026 - Jun 26, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Prevalence of Myopia and Machine Learning-Based Analysis of Risk Factors among School-Aged Students in Shangrao, China: A Cross-Sectional Study

  • Qi Xu

ABSTRACT

Background:

Myopia has escalated into a critical public health crisis among children and adolescents in China. While numerous studies have explored risk factors using traditional statistical methods, there remains a challenge in handling high-dimensional behavioral data and accurately identifying the most predictive variables for precision prevention.

Objective:

This study aimed to investigate the prevalence of myopia and identify key influencing factors among primary and secondary school students in Shangrao, China, using machine learning (ML) models.

Methods:

A school-based cross-sectional study was conducted in October 2024, involving 22,359 students from grades 4 to 12. Data on demographics, visual acuity, non-cycloplegic autorefraction, and behavioral factors were collected. A multi-stage feature selection process, integrating univariate logistic regression with four ML algorithms (least absolute shrinkage and selection operator [LASSO], random forest [RF], extreme gradient boosting [XGBoost], and light gradient boosting machine [LightGBM]), was employed to identify the most predictive variables. The optimal logistic regression model was used to construct a clinical nomogram. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curve, and decision curve analysis (DCA). Finally, model interpretability was enhanced using Shapley Additive Explanations (SHAP) to quantify the impact of each feature.
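For readers unfamiliar with the AUC metric named above, the following is a minimal illustrative sketch, not the authors' code. It computes the AUC through its rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen myopic student receives a higher predicted score than a randomly chosen non-myopic student. The function name `auc_mann_whitney` and the toy labels and scores are assumptions made for illustration only.

```python
def auc_mann_whitney(labels, scores):
    """Rank-based AUC: P(score of a random positive > score of a random negative).

    labels: iterable of 0/1 outcomes (1 = myopia in this illustration)
    scores: iterable of predicted scores or probabilities, same length
    Tied scores receive their average rank, so a tie counts as half a win.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data (hypothetical, not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_mann_whitney(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the values reported in the Results.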

Results:

The overall prevalence of myopia was 58.51%. Prevalence increased significantly with grade level, ranging from 38.15% in upper primary school to 77.74% in senior high school. Females (63.40%) and students with a parental history of myopia (62.07% for one parent, 68.32% for both) exhibited a significantly higher prevalence (all P < 0.001). Among the ML models, the logistic regression model demonstrated the best predictive performance (AUC = 0.726, 95% CI: 0.718–0.734) and was visualized as a nomogram incorporating nine key predictors. The nomogram showed robust discriminative ability (AUC = 0.725), good calibration, and provided net clinical benefit across a wide threshold probability range (25–90%). SHAP analysis revealed that grade level, parental history of myopia, and gender were the most critical predictors. Key modifiable protective factors included maintaining a proper reading distance and spending recess outdoors.
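The "net clinical benefit" reported from the decision curve analysis can be illustrated with the standard net benefit formula, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names and the toy data are hypothetical, and the "treat all" comparator is the usual reference strategy in decision curve analysis.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of acting on everyone whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every student as high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical, not from the study).
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2]
nb_model = net_benefit(toy_labels, toy_probs, 0.5)
nb_all = net_benefit_treat_all(toy_labels, 0.5)
```

A model adds value at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy); a decision curve plots this comparison across a range of thresholds.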

Conclusions:

The prevalence of myopia is high among students in Shangrao. The ML-derived nomogram serves as a practical tool for risk assessment, effectively identifying both established and modifiable risk factors. Our findings support a precision prevention strategy focusing on high-risk groups while promoting behavioral interventions, such as ensuring adequate reading distance and encouraging outdoor recess.


 Citation

Please cite as:

Xu Q

Prevalence of Myopia and Machine Learning-Based Analysis of Risk Factors among School-Aged Students in Shangrao, China: A Cross-Sectional Study

JMIR Preprints. 12/04/2026:97486

DOI: 10.2196/preprints.97486

URL: https://preprints.jmir.org/preprint/97486


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.