Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 9, 2025
Date Accepted: Mar 6, 2026
Machine Learning-Based Multidimensional Oximetry for Obstructive Sleep Apnea Screening: Development and External Validation
ABSTRACT
Background:
Obstructive sleep apnea (OSA) is a prevalent sleep disorder characterized by recurrent upper airway obstructions during sleep, leading to frequent hypoxia [1]. Driven by rising rates of obesity, population aging, and lifestyle changes, OSA affects nearly 1 billion individuals worldwide [1]. Untreated OSA is significantly associated with an increased risk of various diseases and traffic accidents [2, 3]. Although polysomnography (PSG) remains the gold standard for diagnosis, its high cost and operational complexity limit its utility for large-scale community screening [4, 5]. Existing prediction models based on questionnaires or single physiological parameters, such as STOP-BANG, exhibit insufficient sensitivity, with area under the curve (AUC) values ranging from 0.55 to 0.82 [6, 7]. Therefore, there is a critical need to develop validated and cost-effective screening protocols to facilitate detection of OSA in high-risk populations.
Hypoxia is a critical pathophysiological mechanism underlying OSA [3]. Various oximetry parameters derived from peripheral oxygen saturation (SpO₂) during sleep serve as potential biomarkers for OSA screening. Key indicators include the oxygen desaturation index (ODI) [8], minimum pulse oxygen saturation (MinSpO₂) [9], average pulse oxygen saturation (MeanSpO₂) [10], and hypoxia burden (HB) [11, 12]. Because SpO₂ is easy to collect, these parameters have been widely adopted as a simplified alternative to PSG [13]. Previous studies have demonstrated that ODI is an independent predictor of OSA severity, reflecting the frequency of hypoxic events; however, it fails to capture the depth and cumulative effects of hypoxia [8, 14, 15]. OSA is a highly heterogeneous disease, and the predictive ability of a single oximetry parameter is often constrained by limited sensitivity and individual variability. Therefore, multi-parameter models combining several oximetry indices may offer a more robust and reliable tool for OSA screening [4, 9, 16].
Additionally, integrating advanced signal processing techniques, such as approximate entropy and power spectral analysis, further enhances the accuracy of oximetry-based OSA screening [17, 18]. In recent years, machine learning (ML) has emerged as a powerful tool in various domains, including data mining and predictive modeling [6]. Advanced ML algorithms efficiently process complex classification features, capturing intricate relationships between high-risk factors and disease outcomes, thereby improving clinical prediction performance [19]. Among these algorithms, CatBoost has demonstrated superior performance in multiple studies, but its application in OSA screening remains underexplored [20, 21]. Our study aims to integrate multiple oximetry parameters and ML algorithms to develop a simple and effective OSA prediction model. We also analyze the differences in model performance across gender and age subgroups, providing a theoretical basis for personalized OSA screening strategies.
Objective:
Obstructive sleep apnea (OSA) is a prevalent sleep disorder that poses serious health risks and imposes a substantial economic burden on society. There is an urgent need to develop more accessible and effective screening methods to identify individuals at risk of OSA.
Methods:
We recruited 3710 participants who underwent polysomnography (PSG), of whom 2195 were ultimately included in the study. Eight oximetry parameters were derived from the pulse oximetry (SpO₂) signal, including the oxygen desaturation index (ODI), minimum oxygen saturation (MinSpO₂), hypoxia burden (HB), and the percentage of total sleep time with SpO₂ < 90% (ST90). These parameters were used to construct a total of 28 two-parameter models, 56 three-parameter models, and 70 four-parameter models. Six machine learning (ML) algorithms were employed for training, and model performance was evaluated using the area under the curve (AUC). Model interpretability was analyzed using Shapley Additive Explanations (SHAP).
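The model counts above match the number of unordered subsets of eight parameters: C(8,2) = 28, C(8,3) = 56, and C(8,4) = 70, or 154 candidate models in total. The following is a minimal Python sketch of that enumeration, not the authors' code. Five parameter names are taken from the abstract; the remaining three (`param_6`, `param_7`, `param_8`) are hypothetical placeholders, because the abstract does not list all eight. The function name `enumerate_models` is likewise an assumption made for illustration.

```python
from itertools import combinations

# ODI, MinSpO2, MeanSpO2, HB, and ST90 are named in the abstract.
# param_6..param_8 are hypothetical placeholders for the unnamed parameters.
PARAMETERS = [
    "ODI", "MinSpO2", "MeanSpO2", "HB", "ST90",
    "param_6", "param_7", "param_8",
]


def enumerate_models(parameters, sizes=(2, 3, 4)):
    """Return {subset size: list of parameter tuples} for each requested size."""
    return {k: list(combinations(parameters, k)) for k in sizes}


models = enumerate_models(PARAMETERS)
counts = {k: len(subsets) for k, subsets in models.items()}
print(counts)  # {2: 28, 3: 56, 4: 70}
```

Each tuple in `models[k]` would then serve as the feature set for one candidate model, which could be trained and compared by AUC as the Methods describe.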
Results:
The ST90-ODI-MinSpO₂-HB model was the best-performing model, achieving an AUC of 0.9885 and an accuracy of 0.9453. Among the six ML algorithms, CatBoost demonstrated superior performance. SHAP analysis identified ODI, HB, and MinSpO₂ as key predictors. Subgroup analysis revealed that the ODI-MinSpO₂-MeanSpO₂-HB model achieved higher predictive performance in females (AUC = 0.9924) and elderly subjects (AUC = 0.9950).
Conclusions:
The multi-parameter oximetry model using the CatBoost algorithm performed better than the single-parameter models, offering a simple and accurate tool for OSA screening. Calibration based on gender and age can further enhance its clinical utility.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.