Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Oct 25, 2021
Open Peer Review Period: Oct 25, 2021 - Dec 20, 2021
Date Accepted: Dec 18, 2022
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Automatic Depression Detection of Mobile-Based Text-dependent Speech Signals Using a Deep CNN Approach: A Prospective Cohort Study
ABSTRACT
Background:
In the future, automatic diagnosis of depression based on speech could complement mental health treatment methods. Previous studies have reported that acoustic properties can be used to recognize depression, including mel-frequency cepstral coefficients (MFCCs), which are widely applied in speech recognition. However, few studies have examined whether these characteristics allow differential diagnosis of patients with depressive disorder.
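For context, MFCCs are built on the mel scale, which warps frequency to approximate human pitch perception. Below is a minimal sketch of the standard Hz-to-mel conversion (the common 2595/700 formulation); this is general background, not code from the study:

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (standard HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse conversion: mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Example: concert A (440 Hz) expressed in mels
m_440 = hz_to_mel(440.0)
```

MFCC extraction applies this warping via a mel filterbank before the cepstral (DCT) step, so low frequencies are resolved more finely than high ones.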
Objective:
This paper proposes a framework for automatic depression detection in a mobile environment, where speech data can be easily obtained. Specifically, we recorded speech data from a predefined text-based reading task performed on a mobile device, investigated whether the recorded data can screen for depression, and proposed a deep learning-based framework that supports automatic depression detection.
Methods:
We recruited 125 patients who met the criteria for major depressive disorder (MDD) and 113 healthy controls without current or past mental illness. Participants' voices were recorded on a smartphone while they read predefined text-based sentences. We investigated the differences in voice characteristics between the MDD and healthy control groups using statistical analysis. We also investigated the feasibility of automatic depression detection using the proposed log mel (LM) spectrogram-based deep convolutional neural network (CNN) architectures and machine learning models.
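The abstract does not specify the exact feature-extraction settings; the sketch below illustrates the generic log mel spectrogram pipeline that the LM features are based on (framing, windowed power spectrum, triangular mel filterbank, log compression). The frame length, hop size, and filter count here are illustrative assumptions, not the study's parameters:

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a Hamming-windowed frame."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filters spanning 0..sr/2 Hz."""
    hz_to_mel = lambda f: 2595.0 * math.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    top = hz_to_mel(sr / 2.0)
    mel_pts = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sr)) for m in mel_pts]
    banks = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        fb = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            fb[k] = (k - left) / (center - left)
        for k in range(center, right):
            fb[k] = (right - k) / (right - center)
        banks.append(fb)
    return banks

def log_mel_spectrogram(signal, sr, frame_len=64, hop=32, n_mels=8):
    """Log mel energies per frame (toy sizes for illustration)."""
    banks = mel_filterbank(n_mels, frame_len, sr)
    out = []
    for frame in frame_signal(signal, frame_len, hop):
        spec = power_spectrum(frame)
        out.append([math.log(sum(f * s for f, s in zip(bank, spec)) + 1e-10)
                    for bank in banks])
    return out

# Toy usage: a 440 Hz tone sampled at 8 kHz
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(512)]
lm = log_mel_spectrogram(tone, sr)  # frames x mel-bands matrix
```

The resulting frames-by-bands matrix is the 2-D "image" that a spectrogram-based CNN would consume; production code would typically use an FFT-based library rather than the naive DFT shown here.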
Results:
We found statistically significant differences between the MDD and control groups in the MFCC features extracted from utterances of predefined text-based sentences. Moreover, the best accuracy achieved with the LM spectrogram-based CNN and a softmax classifier on the speech data was 80.00%. Our results show that the deep-learned acoustic characteristics lead to better classifier performance than the conventional approach.
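For readers unfamiliar with the final classification step: a softmax layer maps the CNN's output scores (logits) to class probabilities. A minimal, numerically stable sketch with hypothetical logits for the two classes (not values from the study):

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max logit before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical CNN logits for the classes [MDD, control]
probs = softmax([2.0, -1.0])
predicted = "MDD" if probs[0] > probs[1] else "control"
```

The predicted class is simply the one with the larger probability; reported accuracy is the fraction of held-out recordings for which this prediction matches the clinical label.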
Conclusions:
This study suggests that analyzing speech data recorded while reading text-dependent sentences could help predict depression status automatically by capturing characteristics of depression. Our method can contribute to an approach that allows individuals to easily and automatically assess their depressive state anytime, anywhere, without the need for experts to conduct psychological assessments on-site.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.