JMIR Preprints #33460: Evaluating Web-Based Automatic Transcription for Alzheimer’s Speech Data: Transcript Comparison and Machine Learning Analysis

Evaluating Web-Based Automatic Transcription for Alzheimer’s Speech Data: Transcript Comparison and Machine Learning Analysis

Thomas Soroski;
Thiago da Cunha Vasco;
Sally Newton-Mason;
Saffrin Granby;
Caitlin Lewis;
Anuj Harisinghani;
Matteo Rizzo;
Cristina Conati;
Gabriel Murray;
Giuseppe Carenini;
Thalia Shoshana Field;
Hyeju Jang

ABSTRACT

Background:

Speech data for medical research can be collected noninvasively and in large volumes, and speech analysis has shown promise in diagnosing neurodegenerative disease. Transcription is needed to make effective use of speech data because the lexical content carries valuable information. Manual transcription, while highly accurate, limits the scalability and cost savings that language-based screening could otherwise offer.

Objective:

To better understand the use of automatic transcription for classification of neurodegenerative disease (Alzheimer’s Disease [AD], mild cognitive impairment [MCI] or subjective memory complaints [SMC] versus healthy controls), we compared automatically generated transcripts against transcripts that went through manual correction.

Methods:

We recruited individuals from a memory clinic (“patients”) with a diagnosis of mild-to-moderate AD (n=44), MCI (n=20), or SMC (n=8), and healthy controls living in the community (n=77). Participants were asked to describe a standardized picture, read a paragraph, and recall a pleasant life experience. We compared transcripts generated using Google speech-to-text software with manually verified transcripts by examining transcription confidence scores, transcription error rates, and machine learning classification accuracy. For the classification tasks, we used Logistic Regression, Gaussian Naive Bayes, and Random Forests.
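
As an illustration of what a transcription error rate can look like in practice, the following is a minimal sketch of word error rate (WER), a common metric for comparing an automatic transcript against a verified reference. The abstract does not state which error-rate formula was used, so the choice of WER, the function name word_error_rate, and the example sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: the abstract does not specify the exact
    error-rate formula, so plain WER is assumed here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, invented for this example.
verified = "the boy is taking a cookie from the jar"
automatic = "the boy is taking cookie from a jar"
wer = word_error_rate(verified, automatic)
```

Here the automatic transcript drops one word and substitutes another, so the sketch counts 2 errors against 9 reference words. A per-participant WER computed this way could then be compared between groups, which is the kind of comparison the Methods describe.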

Results:

The transcription software showed higher confidence scores (P<.001) and lower, though not significantly lower, error rates (P>.05) for speech from healthy controls than for speech from patients. Classification models using human-verified transcripts significantly (P<.001) outperformed models using automatically generated transcripts for both spontaneous speech tasks, whereas the same comparison showed no difference for the reading task. Manually adding pauses to transcripts had no impact on classification performance, but manually correcting the transcripts of both spontaneous speech tasks led to significantly higher performance in the machine learning models.

Conclusions:

We found that automatically-transcribed speech data could be used to distinguish patients with a diagnosis of AD, MCI or SMC from controls. We recommend a human verification step to improve the performance of automatic transcripts, especially for spontaneous tasks. Moreover, human verification can focus on correcting errors and adding punctuation to transcripts. Manual addition of pauses, however, is not needed, which can simplify the human verification step to more efficiently process large volumes of speech data.

Citation

Please cite as:

Soroski T, da Cunha Vasco T, Newton-Mason S, Granby S, Lewis C, Harisinghani A, Rizzo M, Conati C, Murray G, Carenini G, Field TS, Jang H

Evaluating Web-Based Automatic Transcription for Alzheimer Speech Data: Transcript Comparison and Machine Learning Analysis

JMIR Aging 2022;5(3):e33460

DOI: 10.2196/33460

PMID: 36129754

PMCID: 9536526

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Aging

Date Submitted: Sep 8, 2021

Date Accepted: Jul 23, 2022
