Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 4, 2022
Date Accepted: Jul 28, 2022
The Effectiveness of Supervised Machine Learning in Screening and Diagnosing Voice Disorders: A Systematic Review and Meta-Analysis
ABSTRACT
Background:
Voice screening and diagnosis are both part of the clinical investigation of voice disorders. Both rely on a limited number of standardized tests, which are affected by the clinician’s experience and subjective judgment. Machine learning (ML) algorithms have been introduced as an objective tool for screening and diagnosing patients’ voices, and numerous studies have investigated their effectiveness in assessing and diagnosing voice disorders.
Objective:
This systematic review aims to assess the effectiveness of ML algorithms in screening and diagnosing voice disorders.
Methods:
An electronic search was conducted in five databases. We included studies that examined the performance (accuracy, sensitivity, and specificity) of any ML algorithm in detecting abnormal voice samples. Two reviewers independently selected the studies, extracted data from them, and assessed their risk of bias. The methodological quality of each study was assessed using the QUADAS-2 tool. Characteristics of studies, population, and index tests were extracted. Meta-analyses were conducted to pool the accuracy, sensitivity, and specificity of ML techniques. Heterogeneity was addressed by excluding some studies and by discussing its possible sources.
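As an illustration of the three performance measures named above, the following minimal Python sketch derives accuracy, sensitivity, and specificity from a study's confusion-matrix counts. All names (ConfusionCounts, weighted_mean, and so on) are hypothetical and chosen for this example only. The sample-size-weighted mean is a simplifying assumption: the abstract does not state which pooling model the review used, and this sketch should not be read as the authors' method.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one study (hypothetical container, not from the review)."""
    tp: int  # disordered voices correctly flagged as abnormal
    fp: int  # healthy voices wrongly flagged as abnormal
    tn: int  # healthy voices correctly passed as normal
    fn: int  # disordered voices wrongly passed as normal


def accuracy(c: ConfusionCounts) -> float:
    # Share of all samples that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    # Share of truly disordered voices that were detected.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of truly healthy voices that were recognized as healthy.
    return c.tn / (c.tn + c.fp)


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    # ASSUMPTION: a plain weighted average, used here only to show the idea
    # of pooling; it is not the pooling model reported by the review.
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total
```

For example, a study with 90 true positives, 20 false positives, 80 true negatives, and 10 false negatives would be passed in as ConfusionCounts(tp=90, fp=20, tn=80, fn=10), and the per-study values could then be combined with weighted_mean using each study's sample size as its weight.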
Results:
Of 1409 records retrieved, 13 studies (4079 participants) were included in this review. Thirteen ML techniques were used in the included studies; the most commonly used was the support vector machine (SVM). The pooled accuracy, sensitivity, and specificity of ML techniques in screening voice disorders were 93%, 96%, and 93%, respectively. LS-SVM had the highest accuracy (99%), while K-NN had the highest sensitivity (98%) and specificity (98%). Quadratic discriminant analysis (QDA) achieved the lowest accuracy (91%), sensitivity (89%), and specificity (89%).
Conclusions:
ML showed promising findings in screening voice disorders. However, the findings were not conclusive for diagnosing voice disorders because few studies used ML for diagnostic purposes; more investigations are therefore needed. Accordingly, it might not be possible to use ML as a substitute for the current diagnostic tools. Instead, it might be used as a decision support tool for clinicians assessing their patients, which could improve the management process for voice disorders assessment.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.