Trade-off Analysis of Classical and Deep Learning Models for Robust Brain Tumor Detection
ABSTRACT
Background:
Medical image analysis plays a critical role in brain tumor detection, but training deep learning models often requires large labeled datasets, which are time-consuming and costly to assemble. This study presents a comparative analysis of machine learning and deep learning models for brain tumor classification, focusing on whether deep learning models are necessary for small medical datasets and whether self-supervised learning can reduce annotation costs.
Objective:
The primary goal of this study is to evaluate the trade-offs among traditional machine learning, deep learning, and self-supervised learning approaches under small medical image data constraints. The secondary goal is to assess model robustness by applying controlled image perturbations to unseen test data to simulate real-world challenges.
Methods:
We compared four models for brain tumor image classification: (1) a Support Vector Machine (SVM) with Histogram of Oriented Gradients (HOG) features, (2) a Convolutional Neural Network (CNN) based on ResNet18, (3) a Transformer-based model using Vision Transformer (ViT-B/16), and (4) a Self-Supervised Learning (SSL) approach using Simple Contrastive Learning of Visual Representations (SimCLR). These models were selected to represent diverse paradigms in medical image classification. SVM+HOG represents traditional feature engineering with low computational cost, ResNet18 is a well-established CNN with strong baseline performance, ViT-B/16 uses self-attention to capture long-range spatial features, and SimCLR enables learning from unlabeled data, offering potential savings in annotation cost. The dataset comprised 1502 training images, 429 validation images, and 215 test images (2146 images in total, roughly a 70/20/10 split). All models were trained under consistent conditions, which included data augmentation, early stopping, and multiple runs with different random seeds to account for variability. Performance was assessed using accuracy, precision, recall, and F1 score, along with training time and the number of epochs required for convergence. To further evaluate model generalizability, we conducted a robustness evaluation by testing the models on corrupted versions of the unseen test data, using perturbations such as Gaussian blur and random rotations.
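The four reported metrics can be illustrated with a minimal sketch. It assumes a binary setting (1 = tumor, 0 = no tumor), which the abstract does not state explicitly. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall (sensitivity), and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive (tumor) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical, not from the study): 1 = tumor, 0 = no tumor.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall on the tumor class is the same quantity as sensitivity, so a missed tumor (false negative) lowers it directly while leaving precision unchanged.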
Results:
The results revealed clear trade-offs. ResNet18 achieved the highest validation accuracy (97.44% ± 1.14%) and the lowest validation loss, showing strong convergence and generalization. On the unseen test data, ResNet18 with test-time augmentation achieved 97% weighted accuracy and 99% sensitivity, correctly identifying nearly all tumor cases. SimCLR achieved 95.03% ± 0.04% validation accuracy but required two training phases: contrastive pretraining followed by a linear evaluation phase. ViT-B/16 achieved similar accuracy (94.56% ± 0.61%) but showed signs of slight underfitting due to its large capacity and lack of convolutional priors, which limited learning effectiveness on a small image dataset. SVM+HOG maintained a competitive accuracy of 93.24%. SimCLR also demonstrated real-world annotation efficiency by leveraging unlabeled data.
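The "mean ± standard deviation" figures summarize runs with different random seeds. A minimal sketch of that aggregation follows. It assumes the sample standard deviation (the abstract does not say whether sample or population deviation was used). The helper `summarize_runs` and the per-seed values are hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(accuracies):
    """Format per-seed accuracies (in percent) as 'mean% ± std%'."""
    m = mean(accuracies)
    # Sample standard deviation (n - 1 denominator); undefined for one run.
    s = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return f"{m:.2f}% ± {s:.2f}%"


# Hypothetical validation accuracies from three seeds (not study data).
seed_accuracies = [96.0, 97.0, 98.0]
summary = summarize_runs(seed_accuracies)
print(summary)
```

A small deviation indicates that results were stable across seeds, while a larger one indicates greater sensitivity to initialization and data ordering.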
Conclusions:
The study reveals meaningful trade-offs between model complexity, training cost, annotation requirements, and deployment feasibility, all of which are critical factors when selecting models for real-world medical imaging applications.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.