Accepted for/Published in: JMIR Formative Research
Date Submitted: Feb 4, 2022
Date Accepted: May 9, 2022
Evaluation of a Digital Content-free Speech Analysis Tool to Measure Affective Distress in Mental Health
ABSTRACT
Background:
Mental illnesses are a significant problem worldwide, and mood disorders and depression are pervasive. They represent severe health and emotional impairment for the individual and a considerable economic and social burden. Fast and reliable diagnosis and appropriate treatment and care are therefore of great importance. The initial diagnosis and follow-up, especially in rural areas, must often be carried out by physicians who do not have much psychiatric experience. Verbal communication can reveal the speaker's mental state regardless of content, through speech melody, intonation, and similar features. Both in everyday life and under clinical conditions, a listener with appropriate previous knowledge or specialist training can gain helpful insight into the speaker's psychological state. However, experienced therapists and the necessary time are often unavailable, which opens opportunities for automated capture of noncontentual linguistic information. To improve the care of patients with depression, we conducted a proof-of-concept study with a specialized tool for assessing their most critical cognitive parameters through a noncontentual analysis of their active speech. Using speech analysis to assess and track mental health patients via their smartphones opens up the possibility of remote, automatic, and ongoing evaluation, in line with current trends toward the increasing use of digital and mobile health tools.
Objective:
The primary aim of this study is to evaluate whether the presence or absence of a depressive mood in participants can be measured by comparing the analysis of noncontentual speech parameters with the results of the Patient Health Questionnaire (PHQ-9).
Methods:
This proof-of-concept study included participants in different affective phases (with and without depression). Inclusion criteria were a neurological or psychiatric diagnosis made by a specialist and fluent use of the German language. Exclusion criteria were diagnoses such as psychosis, dementia, speech or language disorders in neurological diseases, a history of addiction, a suicide attempt within the last 12 months, or insufficient language skills. The measuring instrument was the VoiceSense digital voice analysis tool, which enables the analysis of 200 specific speech parameters, with findings assessed against psychometric instruments and questionnaires (PHQ-9).
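For illustration only: the VoiceSense parameter set is proprietary and not described in this paper, so the following minimal sketch uses the open-source librosa library and a hypothetical recording path to show what extracting simple noncontentual prosodic parameters (speech melody, intensity) from an audio file can look like. It is not the study's actual pipeline.

```python
# Minimal sketch of noncontentual (prosodic) feature extraction.
# Assumptions: librosa is installed and "recording.wav" is a mono speech file;
# neither the features nor the path reflect the proprietary VoiceSense tool.
import numpy as np
import librosa

y, sr = librosa.load("recording.wav", sr=16000)  # hypothetical recording

# Fundamental frequency (speech melody) via probabilistic YIN
f0, voiced_flag, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)

# Intensity via short-term RMS energy
rms = librosa.feature.rms(y=y)[0]

features = {
    "pitch_mean_hz": float(np.nanmean(f0[voiced_flag])),
    "pitch_std_hz": float(np.nanstd(f0[voiced_flag])),  # flat melody may indicate monotony
    "energy_mean": float(np.mean(rms)),
    "voiced_ratio": float(np.mean(voiced_flag)),        # rough speech/pause balance
}
print(features)
```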
Results:
In total, 292 psychiatric and voice assessments were performed with 163 participants (47 male) aged 15 to 82 years. At assessment time, 87 participants were not depressed and clinically mild to moderate depressive phases were identified in 88 participants; 98 participants showed subsyndromal symptoms, and 19 participants were severely depressed. The speech analysis clearly differentiated between the individual depression levels identified by the PHQ-9, particularly between non-depressed and depressed participants. The study shows a Pearson correlation of 0.41 between clinical assessment and noncontentual speech analysis (p<0.0001).
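For readers who want to reproduce this kind of association on their own data, the reported statistic can be computed as a Pearson correlation between PHQ-9 total scores and speech-derived scores. A minimal sketch, assuming scipy and two equally long score arrays (the variable names and values below are placeholders, not study data):

```python
# Sketch: Pearson correlation between clinical PHQ-9 scores and
# speech-derived scores. All values are hypothetical placeholders.
import numpy as np
from scipy.stats import pearsonr

phq9_scores = np.array([3, 7, 12, 18, 5, 21, 9, 14])                 # hypothetical
speech_scores = np.array([0.2, 0.4, 0.6, 0.8, 0.3, 0.9, 0.5, 0.7])   # hypothetical

r, p = pearsonr(phq9_scores, speech_scores)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")  # the study reports r = 0.41, p < 0.0001
```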
Conclusions:
The use of speech analysis shows a high level of accuracy, not only in the general recognition of a clinically relevant depressive state in the subjects, but also in the high degree of agreement with experienced clinical practitioners' assessment of the extent of the depressive impairment. In our view, applying the noncontentual analysis system in everyday clinical practice makes sense, especially for a quick and unproblematic assessment of the state of mind, which can even be carried out without direct contact.
Clinical Trial: ClinicalTrials.gov NCT03700008
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.