Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Oct 25, 2020
Date Accepted: Apr 25, 2021

The final, peer-reviewed published version of this preprint can be found here:

Deep Learning Application for Vocal Fold Disease Prediction Through Voice Recognition: Preliminary Development Study

Hu HC, Chang SY, Wang CH, Li KJ, Cho HY, Chen YT, Lu CJ, Tsai TP, Lee OKS

Deep Learning Application for Vocal Fold Disease Prediction Through Voice Recognition: Preliminary Development Study

J Med Internet Res 2021;23(6):e25247

DOI: 10.2196/25247

PMID: 34100770

PMCID: 8241431

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Artificial Intelligence Application for Vocal Fold Disease Prediction Through Voice Recognition: Development and Usability Study

  • Hao-Chun Hu; 
  • Shyue-Yih Chang; 
  • Chuen-Heng Wang; 
  • Kai-Jun Li; 
  • Hsiao-Yun Cho; 
  • Yi-Ting Chen; 
  • Chang-Jung Lu; 
  • Tzu-Pei Tsai; 
  • Oscar Kuang-Sheng Lee

ABSTRACT

Background:

Dysphonia affects quality of life by interfering with communication. However, laryngoscopic examination is expensive and not readily accessible in primary care units, and an accurate diagnosis requires an experienced laryngologist.

Objective:

This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence.

Methods:

We collected 29 normal voice samples and 527 samples from individuals with voice disorders, including vocal atrophy (n=210), unilateral vocal paralysis (n=43), organic vocal fold lesions (n=244), and adductor spasmodic dysphonia (n=30). The 556 samples were divided into a training set of 440 samples and a testing set of 116 samples. A convolutional neural network approach was applied to train the model, and its performance was compared with that of human specialists.
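The abstract does not describe how the 440/116 partition was drawn. As a rough illustration only, a stratified split over the five diagnostic classes (class counts taken from the abstract; all function and variable names here are hypothetical, not from the authors' code) could be sketched as:

```python
import random

# Class counts reported in the abstract (total = 556).
CLASS_COUNTS = {
    "normal": 29,
    "vocal_atrophy": 210,
    "unilateral_vocal_paralysis": 43,
    "organic_vocal_fold_lesions": 244,
    "adductor_spasmodic_dysphonia": 30,
}

def stratified_split(class_counts, test_fraction, seed=0):
    """Split sample IDs into train/test sets class by class,
    so each diagnosis keeps roughly the same proportion."""
    rng = random.Random(seed)
    train, test = [], []
    for label, n in class_counts.items():
        ids = [(label, i) for i in range(n)]
        rng.shuffle(ids)
        n_test = round(n * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test

train_set, test_set = stratified_split(CLASS_COUNTS, test_fraction=116 / 556)
print(len(train_set), len(test_set))  # → 440 116
```

Stratification matters here because the classes are highly imbalanced (29 normal samples vs. 244 organic lesion samples), so a naive random split could leave a rare class nearly absent from the test set.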

Results:

The convolutional neural network model achieved a sensitivity of 0.70, a specificity of 0.90, and an overall accuracy of 65.5% in distinguishing among normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. By comparison, the overall accuracy of the two laryngologists was 58.6% and 49.1%, and that of the two general ear, nose, and throat doctors was 38.8% and 34.5%.
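The underlying confusion matrices are not given in the abstract, but the reported metrics follow the standard definitions. As a reference sketch (the example counts below are illustrative, not the study's data):

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

def overall_accuracy(confusion):
    """Multiclass accuracy: correctly classified samples (the diagonal
    of the confusion matrix) divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Illustrative two-class confusion matrix: rows = true class, cols = predicted.
example = [[3, 1],
           [1, 3]]
print(overall_accuracy(example))  # → 0.75
```

For a five-class problem like this one, sensitivity and specificity are typically computed per class (one-vs-rest) and then summarized, while overall accuracy is a single number over all 116 test samples.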

Conclusions:

We developed an artificial intelligence-based screening tool for common vocal fold diseases that showed high specificity after training on our Mandarin pathological voice database. This approach demonstrates the clinical potential of artificial intelligence for general vocal fold disease screening via voice, for example as a quick survey during a general health examination. It can also be applied in telemedicine for areas where primary care units lack laryngoscopic capability.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.