Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Oct 20, 2025
Date Accepted: Jan 30, 2026
Enhanced MRI-based Knee Cartilage Segmentation using the Swin-UNet Conditional Generative Adversarial Network
ABSTRACT
Background:
Accurate segmentation of bone and cartilage from magnetic resonance imaging (MRI) is crucial for the diagnosis and surgical planning of knee osteoarthritis. However, manual segmentation is time-consuming, and conventional computed tomography (CT)-based surgical systems are limited by their inability to visualize cartilage.
Objective:
This study aimed to develop a novel deep learning framework, the Swin-UNet conditional generative adversarial network (CGAN), for the automatic segmentation of knee bone and cartilage in MRI, and to evaluate its performance against the conventional UNet and UNet CGAN models.
Methods:
Our dataset comprised MRI data from 232 patients. We conducted quantitative and visualization experiments on the proposed Swin-UNet CGAN model and compared the results with those of the widely used UNet and the GAN-based UNet CGAN in three categories (femur, tibia, and femur + tibia), using the Dice similarity coefficient and mean Intersection over Union (IoU) as evaluation metrics.
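For readers unfamiliar with the two metrics, the following is a minimal sketch of how Dice and IoU can be computed for binary masks, with mean IoU taken as the average of per-class IoU values. It is an illustration only, not the authors' evaluation code: the function names, the toy label map, the empty-mask convention, and the choice to average over the femur and tibia classes are all assumptions, since the abstract does not state how the averaging was done. The abstract reports both metrics on a 0 to 100 scale, which corresponds to multiplying these values by 100.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length binary masks given as flat sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is an assumed convention.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def mean_iou(pred_labels, target_labels, classes):
    """Average the per-class IoU over the given class labels (hypothetical helper)."""
    scores = []
    for c in classes:
        pred_mask = [p == c for p in pred_labels]
        target_mask = [t == c for t in target_labels]
        scores.append(dice_and_iou(pred_mask, target_mask)[1])
    return sum(scores) / len(scores)


# Toy 2x4 label maps flattened row by row (assumed labels: 0 = background, 1 = femur, 2 = tibia).
target = [1, 1, 0, 0, 2, 2, 2, 0]
pred = [1, 1, 1, 0, 2, 2, 0, 0]

femur_dice, femur_iou = dice_and_iou([p == 1 for p in pred], [t == 1 for t in target])
print(f"femur Dice={femur_dice:.3f} IoU={femur_iou:.3f}")
print(f"mean IoU over femur and tibia = {mean_iou(pred, target, [1, 2]):.3f}")
```

For a single mask pair the two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so Dice is never lower than IoU for the same prediction.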
Results:
The proposed Swin-UNet CGAN achieved Dice scores of 94.88 for femur bone + cartilage, 94.83 for tibia bone + cartilage, and 94.06 for the combined regions, compared to 92.18, 92.21, and 91.64 for the UNet CGAN and 92.03, 91.85, and 91.25 for the standalone UNet, respectively. The proposed Swin-UNet CGAN achieved IoU values of 92.33 for femur bone + cartilage, 92.18 for tibia bone + cartilage, and 91.26 for the combined regions, compared to 90.48, 90.51, and 89.58 for the UNet CGAN and 90.3, 89.82, and 89.06 for the standalone UNet, respectively. When segmenting cartilage only, all models exhibited decreased Dice scores and IoU values, but the proposed model still achieved higher metrics than both the UNet and UNet CGAN models.
Conclusions:
The Swin-UNet CGAN integrates hierarchical shifted-window attention into its UNet-style transformer generator and uses a pixel-level CNN discriminator to refine outputs, allowing it to capture fine cartilage structures more effectively and achieve more accurate bone and cartilage segmentations than both the standard UNet and the UNet CGAN. This MRI-based deep learning approach addresses critical limitations of CT-based patient-specific instrumentation systems by providing cartilage visualization, potentially improving surgical precision and outcomes in total knee arthroplasty.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.