Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Apr 10, 2025
Date Accepted: Jul 30, 2025
Deep Learning Radiomics Model based on CT Image for Predicting the Classification of Osteoporotic Vertebral Fractures: Algorithm Development and Validation
ABSTRACT
Background:
Osteoporotic vertebral fractures (OVFs) are common in older adults and often lead to disability if not properly diagnosed and classified. With the increased use of CT imaging and the development of radiomics and deep learning technologies, there is potential to improve the accuracy of OVF classification.
Objective:
To evaluate the efficacy of a deep learning radiomics (DLR) model, derived from CT imaging, in accurately classifying OVFs.
Methods:
The study analyzed 981 patients (aged 50–95 years; 687 female, 294 male) from three medical centers who underwent both CT and MRI examinations, covering 1,098 vertebrae. OVFs were classified into Classes 0, 1, and 2 according to the Assessment System of Thoracolumbar Osteoporotic Fractures (ASTLOF). The data were divided into four cohorts: training (n=750), internal validation (n=187), external validation (n=110), and prospective validation (n=51). Deep transfer learning (DTL) used the ResNet-50 architecture, pretrained on RadImageNet and ImageNet, to extract imaging features. DTL-based features were combined with radiomics features and refined using LASSO regression. The performance of eight machine learning classifiers for OVF classification was assessed using ROC metrics and the one-vs-rest approach. Performance comparisons between the RadImageNet- and ImageNet-based models were performed using DeLong's test. SHAP analysis was used to interpret feature importance and the predictive rationale of the optimal fused model.
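The fusion-selection-classification pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions, the synthetic labels, the LASSO alpha, and the choice of a multilayer perceptron are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
dtl = rng.normal(size=(n, 128))       # stand-in for ResNet-50 DTL features
radiomics = rng.normal(size=(n, 60))  # stand-in for handcrafted radiomics features
X = np.hstack([dtl, radiomics])       # early fusion by concatenation

# Synthetic 3-class labels (stand-in for ASTLOF Classes 0/1/2) driven by a
# few informative features, so feature selection has something to find.
w = np.zeros(X.shape[1])
w[[0, 5, 130]] = [1.5, -1.2, 1.0]
score = X @ w
y = np.digitize(score, np.quantile(score, [1 / 3, 2 / 3]))

X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# LASSO regression as a feature filter: keep features with non-zero coefficients
lasso = Lasso(alpha=0.01).fit(X_tr, y_tr)
keep = np.flatnonzero(lasso.coef_)

# One of several candidate classifiers; the paper evaluated eight
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_tr[:, keep], y_tr)
proba = clf.predict_proba(X_va[:, keep])

# Macro-average one-vs-rest AUC across the three classes
macro_auc = roc_auc_score(y_va, proba, multi_class="ovr", average="macro")
print(f"kept {keep.size} of {X.shape[1]} features, macro OvR AUC = {macro_auc:.3f}")
```

In practice the DTL features would come from the penultimate layer of the pretrained ResNet-50 rather than a random matrix, and the LASSO penalty would be tuned by cross-validation.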
Results:
Feature selection and fusion yielded 33 and 54 fused features for the RadImageNet- and ImageNet-based models, respectively, following pretraining on the training set. The best-performing machine learning algorithms for the two DLR models were the multilayer perceptron and LightGBM. On the training set, the macro-average AUC values for the fused models based on RadImageNet and ImageNet were 0.934 and 0.996, respectively, with DeLong's test showing no statistically significant difference (P=2.343). The RadImageNet-based model significantly surpassed the ImageNet-based model across the internal, external, and prospective validation sets, with macro-average AUCs of 0.837 vs. 0.648, 0.773 vs. 0.633, and 0.852 vs. 0.648, respectively (P<.05). Under the binary one-vs-rest approach, the RadImageNet-based fused model achieved its best predictive performance for Class 2 (AUC=0.907, 95% CI 0.805–0.999), followed by Classes 0 and 1 (AUC/accuracy=0.829/0.803 and 0.794/0.768, respectively). SHAP analysis visualized feature importance in the RadImageNet-based fused model, highlighting the three most influential features (Cluster Shade, Mean, and Large Area Low Gray Level Emphasis) and their respective impacts on predictions.
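The per-class one-vs-rest AUCs with 95% confidence intervals reported above can be reproduced in form (not in value) as shown below. The data are synthetic stand-ins, and the percentile bootstrap used for the CI is an assumption; the paper does not state its CI method.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 3, size=200)        # stand-in for Classes 0/1/2
# Synthetic class probabilities biased toward the true class
proba = rng.dirichlet(np.ones(3), size=200)
proba[np.arange(200), y_true] += 0.5
proba /= proba.sum(axis=1, keepdims=True)

def ovr_auc_ci(y, p, cls, n_boot=1000, seed=0):
    """AUC for one class vs. the rest, with a percentile-bootstrap 95% CI."""
    r = np.random.default_rng(seed)
    yb = (y == cls).astype(int)
    auc = roc_auc_score(yb, p[:, cls])
    boots = []
    for _ in range(n_boot):
        idx = r.integers(0, len(y), len(y))
        if yb[idx].min() == yb[idx].max():   # skip single-class resamples
            continue
        boots.append(roc_auc_score(yb[idx], p[idx, cls]))
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return auc, lo, hi

for c in range(3):
    auc, lo, hi = ovr_auc_ci(y_true, proba, c)
    print(f"Class {c}: AUC={auc:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```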
Conclusions:
The RadImageNet-based fused model using CT imaging data exhibited superior predictive performance compared with the ImageNet-based model, demonstrating its utility in OVF classification and supporting clinical decision-making for treatment planning. Among the three classes, the model performed best in identifying Class 2, followed by Class 0 and Class 1.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.