Accepted for/Published in: JMIR Formative Research
Date Submitted: Apr 14, 2025
Date Accepted: Nov 12, 2025
Predicting UHR Outcomes Using Linguistic and Acoustic Measures from HiSoC Recordings: A mHealth Longitudinal Cohort Exploratory Study
ABSTRACT
Background:
Early detection of individuals at ultra-high risk (UHR) for psychosis is essential for timely intervention and improved clinical outcomes. However, current UHR assessments, which rely heavily on psychometric tools, often suffer from low specificity. Recent research suggests that machine learning (ML) can enhance these assessments, particularly through the integration of linguistic and acoustic features.
Objective:
In this study, we investigated the potential of audio recordings from the High-Risk Social Challenge (HiSoC) task in the development of UHR outcome prediction models.
Methods:
Audio recordings of HiSoC task responses were collected from 41 UHR participants (12 converters, 15 remitters, and 14 maintainers) enrolled in the Longitudinal Youth at Risk (LYRIK) study. Responses from the conversion group were obtained within 12 months of psychosis onset, while responses from the remission and maintenance groups were collected at baseline. Linguistic features analyzed included Words per Minute (WPM), Articulation Rate (AR), Disfluencies (DF), and Sequential Coherence (SC). Acoustic features comprised mean and standard deviation of fundamental frequency, mean and standard deviation of intensity, and HF500. To investigate differences in linguistic and acoustic features across outcome groups, multivariate regression analysis was performed. Additionally, a Linear Support Vector Machine (SVM) with nested cross-validation was employed to estimate the generalization error of the predictive models. Model performance was evaluated using balanced accuracy (BA) as the primary metric.
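Balanced accuracy is the mean of per-class recall, which makes it more informative than plain accuracy when outcome groups are unequal in size. The sketch below is an illustration only, not the study's code: it assumes a one-vs-rest framing in which the 12 converters are scored against the remaining 29 participants, and the function name `balanced_accuracy` and the label encoding are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Assumed encoding: 1 = converter, 0 = non-converter (remitter or maintainer).
y_true = [1] * 12 + [0] * 29
# A trivial classifier that always predicts "non-converter".
y_pred = [0] * 41

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
print(round(plain_accuracy, 3), ba)  # 0.707 0.5
```

Under this assumed framing, a classifier that never predicts conversion reaches about 71% plain accuracy but only 0.5 balanced accuracy, so 0.5 is the chance-level reference for BA.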
Results:
The conversion outcome group exhibited lower WPM (adj.P = .024) and higher DF (adj.P = .004) compared to the remission outcome group. No significant differences were found in AR, SC, or acoustic measures across outcome groups. The model built on acoustic features performed best in predicting conversion (BA=0.595, 95% CI [0.287, 0.738]). The best performance in predicting remission was achieved by the model combining linguistic and acoustic features (BA=0.851, 95% CI [0.500, 0.920]).
Conclusions:
Linguistic and acoustic features extracted from HiSoC task responses can distinguish between UHR individuals with varying clinical outcomes. Future advancements in automated transcription technology could enable the complete automation of this workflow, paving the way for a scalable supplementary screening tool to complement existing psychometric assessments.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.