JMIR Preprints #85230: I-Speak-Tele: A Prototype Web Application Combining Automatic Intelligibility Scoring and Acoustic Feature Analysis for Dysarthric Speech

I-Speak-Tele: A Prototype Web Application Combining Automatic Intelligibility Scoring and Acoustic Feature Analysis for Dysarthric Speech

Pierre Vinet;
Pierre Dillenbourg;
Amelieke Slot;
Sharmila Selvanayakam;
Sandra Giovanoli;
Elisa Du;
Julia Cardoso;
Meret Branscheidt;
Chris Easthope Awai;
Christoph Michael Bauer

ABSTRACT

Background:

Dysarthria is a frequent motor speech disorder following stroke, affecting up to 42% of survivors and resulting in reduced speech intelligibility and diminished quality of life. Clinical assessments such as the Frenchay Dysarthria Assessment–2 (FDA-2) rely heavily on subjective judgment by speech-language pathologists (SLPs), which limits comparability and scalability. Telepractice solutions have the potential to extend access to care, but validated digital tools that combine automatic analysis with clinically usable interfaces remain scarce.

Objective:

This study aimed to develop and evaluate a web-based application that integrates automatic speech recognition (ASR) and acoustic analysis into a user-centered dashboard for SLPs. Specifically, we investigated: (1) whether ASR can provide intelligibility scores comparable to human listeners; (2) the usability of the system in two iterative cycles with SLPs; and (3) the feasibility of presenting clinically relevant acoustic features to support tele-rehabilitation.

Methods:

A user-centered design process was followed, involving contextual inquiry, requirements gathering, prototype development, and iterative testing with SLPs. The analytical core of the prototype included an ASR module (Whisper Large-v3) to compute intelligibility scores, combining word error rate–based accuracy with sentence- and word-level alignment. Phoneme-level error highlighting was implemented to identify frequent substitution or deletion patterns. In parallel, an acoustic module extracted clinically relevant measures, including fundamental frequency (mean and range), intensity (mean and variability), and vowel formants (F1–F2 space), supplemented by sustained phonation duration. A pilot validation compared ASR-based intelligibility scores with transcriptions from eight lay listeners for three dysarthric patients performing FDA-2 word and sentence tasks. Usability was evaluated in two cycles with eight and four SLPs, respectively, using the System Usability Scale (SUS) and structured questionnaires.
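The word error rate–based accuracy mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so this covers only the WER component and assumes that the score is 100 × (1 − WER), clipped at zero, after simple text normalization. The function names (`normalize`, `word_error_rate`, `intelligibility_score`) are hypothetical and are not taken from the I-Speak-Tele code base; the hypothesis string stands in for an ASR transcript.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation (keeping apostrophes), and split into words."""
    return re.sub(r"[^\w\s']", "", text.lower()).split()


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = edit distance divided by the number of reference words."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def intelligibility_score(reference, hypothesis):
    """Assumed scoring rule: percentage accuracy, clipped so it never drops below 0."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis)) * 100.0


if __name__ == "__main__":
    target = "The quick brown fox."
    transcript = "the quick brown box"  # stand-in for an ASR transcript
    print(intelligibility_score(target, transcript))
```

The same function can score a human listener's transcription against the target prompt, which is one way ASR and listener scores could be put on a common scale.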

Results:

In the pilot validation, ASR performance was comparable to, and in some cases better than, untrained human listeners for mild and moderate dysarthria, though performance declined with severe cases. Both usability cycles yielded excellent SUS scores (Cycle 1 mean 88.4; Cycle 2 mean 91.7). Core workflow elements, including navigation, session upload, and intelligibility score presentation, were consistently rated highly. Feedback evolved from bug reports and requests for clearer terminology in Cycle 1 to suggestions for advanced analytic features in Cycle 2, such as additional voice-quality indices and integrated note-taking.
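For readers unfamiliar with how SUS values such as those above are obtained, the following sketch applies the standard SUS scoring rule to a single respondent: ten items rated 1 to 5, odd-numbered items contributing (rating − 1), even-numbered items contributing (5 − rating), and the sum multiplied by 2.5 to give a 0 to 100 score. The function name `sus_score` and the example ratings are hypothetical; this is not the authors' analysis code, and the study's per-participant data are not reproduced here.

```python
def sus_score(ratings):
    """Compute one respondent's System Usability Scale score (0 to 100).

    `ratings` holds the ten item responses in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positively worded (odd) items
        else:
            total += 5 - rating   # negatively worded (even) items
    return total * 2.5


def mean_sus(all_ratings):
    """Average SUS score across respondents, as reported per usability cycle."""
    scores = [sus_score(r) for r in all_ratings]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical responses from two respondents.
    print(mean_sus([[5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
                    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]]))
```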

Conclusions:

The prototype demonstrates that automatic intelligibility scoring and acoustic analysis can be integrated into a clinically usable, web-based dashboard. While current limitations include reliance on English-only phoneme analysis, limited advanced acoustic features, and lack of regulatory compliance, the application achieved excellent usability and shows promise for scalable tele-rehabilitation. Future work should expand multilingual support, incorporate additional acoustic measures, and validate the tool in larger clinical cohorts.

Citation

Please cite as:

Vinet P, Dillenbourg P, Slot A, Selvanayakam S, Giovanoli S, Du E, Cardoso J, Branscheidt M, Easthope Awai C, Bauer CM

Automatic Speech Recognition and Acoustic Analysis for Dysarthria Assessment in Telerehabilitation: User-Centered Design and Usability Study

JMIR Form Res 2026;10:e85230

DOI: 10.2196/85230

PMID: 42397858

PMCID: 13331071

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Formative Research

Date Submitted: Oct 3, 2025

Date Accepted: May 15, 2026
