Deep Learning for Age Estimation and Sex Prediction Using Mandibular-Cropped Cephalometric Images: Comparative Backbones and Data-Balancing Scenarios
ABSTRACT
Background:
Mandibular structures offer resilient features for forensic identification when only partial remains are available post mortem. Deep learning applied to cephalometric radiographs offers an opportunity to predict demographic attributes such as age and sex, which are critical in forensic and clinical contexts.
Objective:
This study aimed to develop and evaluate a multi-task deep learning framework for age regression and sex classification from cropped mandibular regions of cephalometric radiographs, comparing multiple CNN backbones and preprocessing scenarios to address class imbalance.
Methods:
A total of 340 anonymized cephalometric radiographs from Indonesian individuals (aged 8–40 years) were cropped into mandibular angle and mandibular length regions, resulting in 680 samples validated by dentists with ≥5 years of experience. Images were resized to 224×224 pixels, deduplicated, and preprocessed under four scenarios: Original, SMOTE, StandardScaler, and SMOScale. Augmentation included random rotation (≤90°), zoom (≤25%), flips, shifts, shear, and brightness variation. Six pre-trained CNN backbones (MobileNetV2, ResNet50V2, InceptionV3, InceptionResNetV2, VGG16, VGG19) were fine-tuned in a multi-task architecture with shared feature extraction and two output heads (sex classification and age regression). Models were trained with the Adam optimizer (lr=1e-4), Huber loss (age), binary cross-entropy (sex), dropout of 0.5, early stopping (patience=10), and learning rate scheduling. Evaluation metrics were F1-score (sex) and MAE/MAPE (age), with 95% confidence intervals estimated by bootstrapping.
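The evaluation step described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state which bootstrap variant, how many resamples, or which F1 averaging was used, so the percentile bootstrap, the default of 1000 resamples, the binary (single positive class) F1, and all function names here are assumptions chosen for illustration. The toy values are made up and are not study data.

```python
import random
from typing import Callable, Sequence, Tuple


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the units of the target (years for age)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no true value is 0."""
    total = sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score for one positive class (assumed here; the paper may average differently)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(
    metric: Callable[[Sequence, Sequence], float],
    y_true: Sequence,
    y_pred: Sequence,
    n_resamples: int = 1000,  # assumed value, not reported in the abstract
    alpha: float = 0.05,      # 0.05 gives a 95% interval
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap: resample test cases with replacement, recompute the metric."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_resamples - 1))]
    upper = stats[int((1 - alpha / 2) * (n_resamples - 1))]
    return lower, upper


# Toy illustration only (made-up numbers, not results from the study).
toy_age_true = [10.0, 20.0, 30.0, 40.0]
toy_age_pred = [12.0, 18.0, 33.0, 37.0]
print("MAE:", mae(toy_age_true, toy_age_pred))
print("MAPE (%):", mape(toy_age_true, toy_age_pred))
print("95% CI for MAE:", bootstrap_ci(mae, toy_age_true, toy_age_pred))
```

Resampling whole test cases (rather than individual errors per task) keeps the age and sex predictions for a radiograph together, which matters if both metrics are to be bootstrapped from the same resamples.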
Results:
VGG16 achieved the best overall performance. For age regression, it reached an MAE of 3.19 years (95% CI: x–y) and MAPE of 13.18% on the Original dataset. For sex classification, VGG16 achieved an F1-score of 86% (95% CI: x–y) with StandardScaler preprocessing, while ResNet50V2 showed the weakest performance (max F1-score: 76%). SMOTE and SMOScale improved MobileNetV2 performance, but InceptionV3 and ResNet50V2 remained limited in male classification.
Conclusions:
This study demonstrates that combining mandibular cropping with deep learning and balanced preprocessing scenarios enhances demographic prediction in cephalometric radiographs. The findings highlight the potential of AI-assisted forensic odontology to support disaster victim identification when partial remains are available.
Clinical Trial:
Not applicable
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.