JMIR Preprints #60520: Usefulness of Automatic Speech Recognition Assessment of Children with Speech Sound Disorders: Validation Study

Usefulness of Automatic Speech Recognition Assessment of Children with Speech Sound Disorders: Validation Study

Do Hyung Kim;
Joo Won Jeong;
Dayoung Kang;
Taekyung Ahn;
Yeonjung Hong;
Younggon Im;
Jae Won Kim;
Min Jung Kim;
Dae-Hyun Jang

ABSTRACT

Background:

Speech sound disorders (SSDs) are common communication challenges in children and are typically evaluated by speech-language pathologists using standardized tools. However, traditional evaluation methods are time-consuming, and their reliability varies somewhat among testers.

Objective:

We developed an automatic speech recognition (ASR) model and assessed its performance in detecting incorrect pronunciations among children with SSDs.

Methods:

The ASR model is an end-to-end model pretrained on 436,000 hours of adult voice data spanning 128 languages. It was further trained on 137 hours of speech data from typically developing children, to adapt it to children’s voices, and on 93.6 minutes of speech data from children with articulation errors, to enhance error detection. Two standardized SSD tests, the Assessment of Phonology and Articulation for Children (APAC) and the Urimal Test of Articulation and Phonology (U-TAP), were used, and the ASR transcriptions were compared with those of speech-language pathologists (SLPs).

Results:

This study included 30 children aged 3–7 years with suspected SSDs. The reliability between SLPs and ASR for the percentage of consonants correct (PCC) was excellent, with an intraclass correlation coefficient (ICC) of 0.984 for APAC (95% CI 0.953–0.994) and 0.978 for U-TAP (95% CI 0.941–0.990). The phoneme error rates (PERs) for APAC and U-TAP were 11.5% and 12.22%, respectively, reflecting phoneme-level discrepancies between the ASR and SLP transcriptions. Regarding disagreements between the ASR and SLPs, phonemes that SLPs transcribed as correct pronunciations accounted for an average of 2.37 (APAC) and 2.7 (U-TAP) occurrences per child, and phonemes that SLPs transcribed as incorrect pronunciations accounted for an average of 7.8 (APAC) and 7 (U-TAP) occurrences per child.
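For readers unfamiliar with the two metrics reported here, the following Python sketch illustrates how a PER and a PCC are commonly computed. The abstract does not state the exact formulas used in the study, so this assumes the conventional definitions: PER as the Levenshtein (edit) distance between the ASR and SLP phoneme sequences divided by the length of the SLP transcription, and PCC as the share of target consonants produced correctly. The function names and the example phoneme sequences are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER in percent. Assumption: the SLP transcription is the reference."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def percent_consonants_correct(n_correct: int, n_total: int) -> float:
    """PCC in percent: correctly produced consonants over target consonants."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


# Hypothetical example: the SLP and the ASR disagree on one of five phonemes.
slp = ["k", "a", "b", "a", "ŋ"]
asr = ["t", "a", "b", "a", "ŋ"]
print(phoneme_error_rate(slp, asr))        # 20.0
print(percent_consonants_correct(35, 70))  # 50.0
```

Under these assumed definitions, a PER of about 12% means that roughly one phoneme in eight differs between the two transcriptions.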

Conclusions:

This study demonstrates the effectiveness of ASR in identifying incorrect pronunciations in children with SSDs.

Citation

Please cite as:

Kim DH, Jeong JW, Kang D, Ahn T, Hong Y, Im Y, Kim JW, Kim MJ, Jang DH

Usefulness of Automatic Speech Recognition Assessment of Children With Speech Sound Disorders: Validation Study

J Med Internet Res 2025;27:e60520

DOI: 10.2196/60520

PMID: 39576242

PMCID: 11775490

JMIR Publications

JMIR Preprints

Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: May 15, 2024

Date Accepted: Nov 17, 2024

Date Submitted to PubMed: Nov 22, 2024

Per the author's request the PDF is not available.