Accepted for/Published in: JMIR Pediatrics and Parenting
Date Submitted: Dec 9, 2021
Open Peer Review Period: Dec 9, 2021 - Dec 21, 2021
Date Accepted: Jan 25, 2022
Date Submitted to PubMed: Apr 18, 2022
Classifying Autism from Crowdsourced Semi-Structured Speech Recordings: A Machine Learning System
ABSTRACT
Background:
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 54 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process requiring the work of trained physicians, significant attention has been given to developing systems that automatically screen for and diagnose autism.
Objective:
Prosody abnormalities are among the clearest signs of autism, with affected children displaying speech idiosyncrasies including echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. In this work, we present a suite of machine learning approaches to detect autism in self-recorded speech audio captured from autistic and neurotypical (NT) children in home environments.
Methods:
We consider three methods to detect autism in child speech: first, Random Forests trained on extracted audio features (including Mel-frequency cepstral coefficients); second, convolutional neural networks (CNNs) trained on spectrograms; and third, fine-tuned wav2vec 2.0—a state-of-the-art Transformer-based speech recognition model. We train our classifiers on our novel dataset of cellphone-recorded child speech audio curated from Stanford’s Guess What? mobile game, an app designed to crowdsource videos of autistic and neurotypical children in a natural home environment.
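As a rough illustration of the first approach, the sketch below trains a Random Forest on fixed-length per-clip acoustic feature vectors. The random arrays here merely stand in for MFCC summary statistics (in practice these would be extracted from the audio with a library such as librosa); the array shapes, label encoding, and classifier settings are illustrative assumptions, not the authors' actual configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for per-clip feature vectors (e.g., means and standard
# deviations of 13 MFCCs -> 26 features per clip). In the real pipeline
# these would be computed from the recorded audio clips.
X = rng.normal(size=(200, 26))    # 200 clips, 26 features each (assumed)
y = rng.integers(0, 2, size=200)  # 0 = NT, 1 = ASD (illustrative labels)

# Hold out a stratified test split so both classes appear in evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```

With random features the accuracy hovers near chance; the point of the sketch is only the shape of the pipeline, in which each clip is reduced to one feature vector before classification.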
Results:
Using five-fold cross-validation, the Random Forest classifier achieves 70% accuracy, the fine-tuned wav2vec 2.0 model achieves 77% accuracy, and the CNN achieves 79% accuracy when classifying children's audio as either ASD or NT.
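The five-fold evaluation protocol can be sketched with scikit-learn's `cross_val_score`. The data below is synthetic and the classifier choice is an assumption for illustration; stratified folds keep the ASD/NT class balance consistent across splits.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 26))    # synthetic per-clip feature vectors
y = rng.integers(0, 2, size=200)  # synthetic ASD/NT labels

# Five stratified folds: each fold serves once as the held-out test set.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print("per-fold accuracy:", np.round(scores, 2))
print("mean accuracy:", round(scores.mean(), 2))
```

Reporting the mean over all five folds, as done for the accuracies above, reduces the variance that a single train/test split would introduce.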
Conclusions:
Our models were able to predict autism status when training on a varied selection of home audio clips with inconsistent recording qualities, which may be more generalizable to real world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.