Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jun 14, 2025
Date Accepted: Sep 9, 2025
Date Submitted to PubMed: Sep 9, 2025
YOLOv12 Algorithm-Aided Detection and Classification of Lateral Malleolar Avulsion Fracture and Subfibular Ossicle Based on CT Images: A Multicenter Study
ABSTRACT
Background:
Lateral malleolar avulsion fracture (LMAF) and subfibular ossicle (SFO) are distinct entities that both present as small bone fragments near the lateral malleolus on imaging, yet they require different treatment strategies. Magnetic resonance imaging (MRI) is the diagnostic gold standard for differentiating LMAF from SFO, whereas differentiation on clinical grounds or on computed tomography (CT) alone is challenging in routine practice, which can impede timely and precise management. Deep convolutional neural networks (DCNNs) have shown promise in musculoskeletal imaging diagnostics, but robust multicenter evidence in this specific context is lacking.
Objective:
To evaluate several state-of-the-art DCNNs, including the latest YOLOv12 algorithm, for detecting and classifying LMAF and SFO on CT images, using MRI-based diagnoses as the gold standard, and to compare model performance with that of radiologists reading CT alone.
Methods:
In this retrospective study, 1,918 patients (LMAF: 1,253; SFO: 665) were enrolled from two hospitals in China between 2014 and 2024. MRI served as the gold standard and was independently interpreted by two senior musculoskeletal radiologists. Only CT images were used for model training, validation, and testing. CT images were manually annotated with bounding boxes. The cohort was randomly split into a training set (n=1,092), an internal validation set (n=476), and an external test set (n=350). Four deep learning models (Faster R-CNN, SSD, RetinaNet, and YOLOv12) were trained and evaluated using identical procedures. Model performance was assessed using mean average precision at an intersection-over-union (IoU) threshold of 0.5 (mAP50), area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. The external test set was also independently interpreted by two musculoskeletal radiologists with 7 and 15 years of experience, and their results were compared with those of the best-performing model. Saliency maps were generated using Shapley values to enhance interpretability.
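The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `binary_metrics`, and the choice of LMAF as the positive class (label 1) are assumptions made only for illustration. At the mAP50 threshold, a predicted box would typically count as a correct detection when its IoU with the annotated box is at least 0.5.

```python
from typing import Dict, Sequence, Tuple

# Assumed box format (not specified in the abstract): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned bounding boxes."""
    # Corners of the overlapping rectangle, if any
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumption for illustration: 1 = LMAF (positive), 0 = SFO (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Toy values, not study data
    predicted, annotated = (0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 15.0, 10.0)
    print("IoU:", iou(predicted, annotated))
    print("Counts as a hit at IoU >= 0.5:", iou(predicted, annotated) >= 0.5)
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two when both classes are present, which makes it a useful sanity check on reported figures.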
Results:
Among the evaluated models, YOLOv12 achieved the highest detection and classification performance, with an mAP50 of 92.1% and an AUC of 0.983 on the external test set, significantly outperforming Faster R-CNN (mAP50: 63.7%, AUC: 0.79), SSD (mAP50: 63.0%, AUC: 0.63), and RetinaNet (mAP50: 67.0%, AUC: 0.73) (all P < .05). When using CT alone, the two radiologists performed at a moderate level (accuracy: 75.6%/69.1%; sensitivity: 75.0%/65.2%; specificity: 76.0%/71.1%), whereas YOLOv12 approached MRI-based reference performance (accuracy: 92.0%; sensitivity: 86.7%; specificity: 82.2%). Saliency maps corresponded well with expert-identified regions.
Conclusions:
Although MRI (read by senior radiologists) is the gold standard for distinguishing LMAF from SFO, CT-based differentiation remains challenging for radiologists. A CT-only DCNN (YOLOv12) achieved substantially higher performance than radiologists reading CT alone and approached the MRI-based reference standard, highlighting its potential to augment CT-based decision-making where MRI access is limited or unavailable.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.