Accepted for/Published in: JMIR AI

Date Submitted: Jun 27, 2024
Date Accepted: Dec 2, 2024

The final, peer-reviewed published version of this preprint can be found here:

Predicting Satisfaction With Chat-Counseling at a 24/7 Chat Hotline for the Youth: Natural Language Processing Study

Hornstein S, Lueken U, Wundrack R, Hilbert K

JMIR AI 2025;4:e63701

DOI: 10.2196/63701

PMID: 39965198

Predicting Satisfaction with Chat-Counseling at a 24/7 Chat Hotline for the Youth: A Natural Language Processing Study

  • Silvan Hornstein
  • Ulrike Lueken
  • Richard Wundrack
  • Kevin Hilbert

ABSTRACT

Background:

Chat-based counseling services are a popular, low-threshold way to provide mental health support to young people. They are also particularly well suited to natural language processing (NLP), which could be used to improve the provision of care.

Objective:

This paper therefore evaluates the feasibility of one such use case: the NLP-based automated evaluation of satisfaction with the chat interaction. This preregistered approach (OSF: SR4Q9) could support evaluation and quality control procedures, which are particularly relevant for such services.

Methods:

The consultations of 2,609 young chatters (around 140,000 messages) and the corresponding feedback were used to train and evaluate classifiers predicting whether a chat was perceived as helpful. First, we trained a word vectorizer in combination with an XGBoost classifier, applying cross-validation and extensive hyperparameter tuning. Second, we trained several transformer-based models, comparing model types, preprocessing, and over- and undersampling techniques. For both model types, we selected the best-performing approach on the training set for a final performance evaluation on the 522 users in the final test set.
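The abstract does not specify which over- and undersampling techniques were compared. As a minimal sketch, assuming plain random resampling on a binary helpful/unhelpful label, the idea could look like this in Python. The function names and the toy data are hypothetical and not taken from the study.

```python
import random
from collections import Counter


def _group_by_label(texts, labels):
    """Collect texts into one list per class label."""
    groups = {}
    for text, label in zip(texts, labels):
        groups.setdefault(label, []).append(text)
    return groups


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly drawn items of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = max(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((text, label) for text in items + extra)
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [y for _, y in pairs]


def random_undersample(texts, labels, seed=0):
    """Keep a random subset of larger classes so all classes match the smallest."""
    rng = random.Random(seed)
    groups = _group_by_label(texts, labels)
    target = min(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        pairs.extend((text, label) for text in rng.sample(items, target))
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [y for _, y in pairs]


# Toy data (hypothetical): 6 chats rated helpful (1), 2 rated unhelpful (0).
toy_texts = [f"chat {i}" for i in range(8)]
toy_labels = [1] * 6 + [0] * 2

_, over_labels = random_oversample(toy_texts, toy_labels)
_, under_labels = random_undersample(toy_texts, toy_labels)
print(Counter(over_labels))   # class counts after oversampling
print(Counter(under_labels))  # class counts after undersampling
```

Resampling of this kind is normally applied to the training data only, so that the test set keeps its original class distribution.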

Results:

The tuned XGBoost classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.67 (P<.001) on the previously unseen test set. The selected Longformer-based model did not outperform this baseline, also scoring 0.67 (P=.92). A SHAP explainability analysis suggested that help seekers who rated a consultation as helpful commonly expressed their satisfaction within the conversation itself. In contrast, the rejection of offered exercises predicted perceived unhelpfulness.
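For readers less familiar with the metric: the AUROC can be read as the probability that a randomly chosen chat rated helpful receives a higher predicted score than a randomly chosen chat rated unhelpful, with 0.5 corresponding to chance. The following is a minimal Python sketch of that pairwise definition. It uses made-up scores rather than study data, and the function name is hypothetical.

```python
def pairwise_auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative values only: 1 = rated helpful, 0 = rated unhelpful.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(pairwise_auroc(example_labels, example_scores))  # 7 of 9 pairs are ordered correctly
```

The nested loop is quadratic in the number of chats, which is acceptable for a test set of a few hundred users. Library implementations typically use a rank-based computation instead.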

Conclusions:

Chat conversations contain relevant information about the perceived quality of an interaction that NLP-based prediction approaches can use. However, determining whether the moderate predictive performance translates into meaningful service improvements requires randomized trials. Further, our results highlight the importance of comparing pretrained models with simpler baselines to avoid implementing unnecessarily complex models.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.