Currently submitted to: JMIR Formative Research
Date Submitted: Mar 17, 2026
Open Peer Review Period: Mar 17, 2026 - May 12, 2026
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Predicting Oral Health Quality Scores Based on Decayed, Missing and Filled Teeth (DMFT) and Caries Indices Using Machine Learning Approaches
ABSTRACT
Background:
This study explores an advanced analytical approach to oral health assessment by modelling the relationship between Decayed, Missing, and Filled Teeth (DMFT), Caries Index, and Oral Health Quality Scores using Generalized Additive Models (GAM) and Multilayer Feedforward Neural Networks (MLFFNN).
Objective:
To construct and validate an analytical framework integrating nonparametric regression and neural networks for predicting Oral Health Quality Scores from DMFT and Caries indices.
Methods:
The dataset comprised 19 observations, with Qualityscore as the dependent variable and DMFT and Caries as predictors. Data preprocessing included normalization, bootstrap resampling, and division into training (60%), testing (30%), and validation (10%) sets. A GAM estimated the smooth contribution of DMFT and the parametric effect of Caries, while a neural network (MLFFNN) was fitted to improve predictive accuracy. Model performance was assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Root Mean Squared Percentage Error (RMSPE), and Median Absolute Error (MedAE).
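As a rough illustration of the data split and the four error metrics named above, the following Python sketch shows one way they could be computed. It is not the authors' code. The function names, the rounding rule that turns 60/30/10 into whole counts, the random seed, and the choice to report RMSPE as a fraction rather than a percentage are all assumptions, and the abstract does not state whether the split was applied before or after bootstrap resampling.

```python
import numpy as np


def split_indices(n, fractions=(0.6, 0.3, 0.1), seed=0):
    """Shuffle 0..n-1 and cut it into train/test/validation index arrays.

    Assumption: train and test sizes are rounded to the nearest integer and
    the validation set receives whatever remains.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_test = int(round(fractions[1] * n))
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def _errors(y_true, y_pred):
    """Return observed minus predicted values as a float array."""
    return np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean(_errors(y_true, y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(_errors(y_true, y_pred))))


def medae(y_true, y_pred):
    """Median absolute error."""
    return float(np.median(np.abs(_errors(y_true, y_pred))))


def rmspe(y_true, y_pred):
    """Root mean squared percentage error, returned as a fraction.

    Assumption: errors are scaled by the observed value, which must be nonzero.
    """
    y_true = np.asarray(y_true, dtype=float)
    rel = _errors(y_true, y_pred) / y_true
    return float(np.sqrt(np.mean(rel ** 2)))


# Hypothetical values for illustration only; these are not the study data.
observed = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]
train_idx, test_idx, val_idx = split_indices(19)
print(len(train_idx), len(test_idx), len(val_idx))
print(rmse(observed, predicted), mae(observed, predicted),
      medae(observed, predicted), rmspe(observed, predicted))
```

Under this particular rounding rule, 19 observations divide into 11 training, 6 testing, and 2 validation cases; a different rule, or a split applied to a bootstrap-enlarged sample, would give different counts.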
Results:
The GAM demonstrated strong explanatory power (Adjusted R² = 0.781; Deviance explained = 81.7%). The neural network yielded identical values for three validation metrics (RMSE = MAE = MedAE = 0.347) and an MSE of 0.203 on the testing data, with an approximate prediction accuracy of 65.29%. Variable contribution analysis indicated that DMFT and Caries contributed 37.25% and 62.75%, respectively, to the model’s predictive capacity.
Conclusions:
Integrating GAM and MLFFNN provides a robust framework for modelling complex oral health data. This hybrid approach effectively predicts Oral Health Quality Scores with minimal error, highlighting the potential of combining statistical and machine learning methods for advanced predictive analytics in dental epidemiology.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.