Accepted for/Published in: JMIR AI
Date Submitted: Nov 25, 2024
Open Peer Review Period: Dec 23, 2024 - Feb 17, 2025
Date Accepted: Mar 31, 2025
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Digital Phenotyping for Detecting Depression Severity in a Large Payor-Provider System: Real-World Performance of Acoustic and Semantic Voice Analysis
ABSTRACT
Background:
There is considerable need to improve and increase the detection and measurement of depression. The use of voice as a digital biomarker of depression represents a considerable opportunity for transforming and accelerating depression identification and treatment; however, research to date has primarily consisted of small-sample feasibility or pilot studies incorporating highly controlled applications and settings. There has been limited examination of the technology in real-world use contexts.
Methods:
A total of 2086 recordings of case management calls that included verbally administered PHQ-9 surveys were analyzed using a machine learning (ML) model after the portions of the recordings containing the PHQ-9 survey had been manually redacted. The recordings were divided into a Development set (n=1336) and a Blind set (n=671). PHQ-8 scores were provided for the Development set for ML model refinement, while PHQ-8 scores for the Blind set were withheld until after the ML model's depression severity output had been reported.
Results:
The Development set and Blind set were well matched for age, biological sex and depression severity. Age (mean ± SD) was 53.7 ± 16.3 years in the Development set and 51.7 ± 16.9 years in the Blind set; the Development set was 68.1% female and the Blind set 68.8% female; and PHQ-8 scores (mean ± SD) were 10.5 ± 6.1 in the Development set and 10.9 ± 6.0 in the Blind set. The Concordance Correlation Coefficient (CCC) for the ML model was ρc=0.57 on the Development set and ρc=0.54 on the Blind set, and the mean absolute error (MAE) was 3.91 on the Development set and 4.06 on the Blind set, demonstrating strong model performance. This performance was maintained when each set was divided into subgroups by age bracket (≤39, 40-64 and ≥65 years), biological sex, and the four categories of the Social Vulnerability Index (SVI, an index based on 16 social factors), with CCCs ranging from ρc=0.44 to ρc=0.61. Performance at PHQ-8 score cutoffs of 5, 10, 15 and 20, which separate the depression severity categories of none, mild, moderate, moderately severe and severe (≥20), was expressed as Receiver Operating Characteristic Area Under the Curve (ROC-AUC) values, which ranged from 0.79 to 0.83 in both the Development and Blind sets.
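For readers less familiar with the agreement metrics reported above, the following is a minimal sketch of how CCC, MAE and a cutoff-based ROC-AUC can be computed from paired reference and predicted PHQ-8 scores. It is not the authors' implementation: the function names and the toy scores are hypothetical, the CCC uses Lin's standard definition with population moments, and a case is assumed to be "positive" at a given cutoff when its reference PHQ-8 score is greater than or equal to that cutoff.

```python
from statistics import fmean


def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    # The squared mean difference penalizes systematic bias, which a
    # plain Pearson correlation would ignore.
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def mean_abs_error(y_true, y_pred):
    """Mean absolute error between reference and predicted scores."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def roc_auc_at_cutoff(y_true, y_pred, cutoff):
    """ROC-AUC for separating reference scores >= cutoff from those below.

    Computed as the probability that a randomly chosen positive case has a
    higher predicted score than a randomly chosen negative case, with ties
    counted as one half (the Mann-Whitney formulation).
    """
    pos = [p for t, p in zip(y_true, y_pred) if t >= cutoff]
    neg = [p for t, p in zip(y_true, y_pred) if t < cutoff]
    if not pos or not neg:
        raise ValueError("both classes must be present at this cutoff")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, made up purely for illustration (not study data).
reference_phq8 = [2, 6, 11, 16, 21, 9]
predicted_phq8 = [4, 5, 12, 13, 18, 11]

print("CCC:", concordance_cc(reference_phq8, predicted_phq8))
print("MAE:", mean_abs_error(reference_phq8, predicted_phq8))
print("AUC at cutoff 10:", roc_auc_at_cutoff(reference_phq8, predicted_phq8, 10))
```

Unlike a Pearson correlation, the CCC drops below 1 when predictions are consistently offset from the reference scores, which is why it is often paired with MAE when a model outputs a score on the same scale as the instrument.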
Conclusions:
Overall, the findings suggest that voice may have significant potential for the detection and measurement of depression severity across a variety of ages, genders and socioeconomic categories, which may enhance treatment, improve clinical decision-making, and enable truly personalized treatment recommendations.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.