Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Oct 25, 2020
Date Accepted: Apr 25, 2021
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Artificial Intelligence Application for Vocal Fold Disease Prediction Through Voice Recognition: Development and Usability Study
ABSTRACT
Background:
Dysphonia influences the quality of life by interfering with communication. However, laryngoscopic examination is expensive and not readily accessible in primary care units. Experienced laryngologists are required to achieve an accurate diagnosis.
Objective:
This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence.
Methods:
We collected 29 normal voice samples and 527 samples of individuals with voice disorders, including vocal atrophy (n=210), unilateral vocal paralysis (n=43), organic vocal fold lesions (n=244), and adductor spasmodic dysphonia (n=30). The 556 samples were divided into two sets: 440 samples as the training set and 116 samples as the testing set. A convolutional neural network approach was applied to train the model and findings were compared with human specialists.
Results:
The convolutional neural network model achieved a sensitivity of 0.70, a specificity of 0.90, and an overall accuracy of 65.5% for distinguishing normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. Compared to human specialists, the overall accuracy was 58.6% and 49.1% for the two laryngologists, and 38.8% and 34.5% for the two general ear, nose, and throat doctors.
Conclusions:
We developed an artificial intelligence-based screening tool for common vocal fold diseases, which possessed high specificity after training with our Mandarin pathological voice database. This approach has clinical potential to use artificial intelligence for general vocal fold disease screening via voice and includes a quick survey during a general health examination. It can be applied in telemedicine for areas that lack laryngoscopic abilities in primary care units.
Citation
Request queued. Please wait while the file is being generated. It may take some time.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on it's website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressively prohibit redistribution of this draft paper other than for review purposes.