Accepted for/Published in: JMIR Public Health and Surveillance
Date Submitted: Jul 1, 2024
Open Peer Review Period: Jul 2, 2024 - Aug 27, 2024
Date Accepted: Nov 7, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Predicting Suicidality in Youth Seeking Help from a Crisis Text Line: Development and Validation of an Explainable Transformer-Based Artificial Intelligence (AI) Text Classifier
ABSTRACT
Background:
Suicide is a pressing public health concern, and machine learning (ML) models can help identify individuals at risk. Studies using benchmark datasets and real-world social media data suggest that transfer learning from pretrained language models (LMs) is a promising approach to predicting suicidal ideation and behaviors in speech and text.
Objective:
We set out to (i) develop and apply ML methods for predicting suicidal ideation and behaviors in a real-world crisis helpline dataset, using transformer-based pretrained models as a building block; (ii) evaluate, cross-validate, and benchmark the model against traditional text classification approaches; and (iii) train an explainer model that provides information about relevant risk-associated features.
Methods:
We used chat protocols from youth aged 14 to 25 years seeking help from a German crisis helpline to train an ML model that combines a transformer-based language model architecture with pretrained weights and long short-term memory (LSTM) layers. We predicted suicidal ideation (SI) and advanced suicidal engagement (ASE), as indicated by composite Columbia-Suicide Severity Rating Scale (C-SSRS) scores, and compared the predictions against those of a classical word-vector-based ML model. We then obtained discrimination, calibration, and clinical utility measures, and derived explainability information using a Shapley value-based post hoc estimation (SHAP) model.
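As a minimal illustration of the Shapley value idea behind such an explainer, the sketch below computes exact Shapley values for a toy scoring function by averaging each token's marginal contribution over all orderings. The function toy_risk_score, its weights, and the token names are hypothetical and not taken from the study; explainers for large models approximate these values rather than enumerate them, so this is not the authors' explainer.

```python
from itertools import permutations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    features: Sequence[str],
    value_fn: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        present = set()
        previous = value_fn(frozenset(present))
        for f in order:
            present.add(f)
            current = value_fn(frozenset(present))
            totals[f] += current - previous
            previous = current
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}


def toy_risk_score(present: FrozenSet[str]) -> float:
    """Hypothetical score with made-up weights, for illustration only."""
    score = 0.0
    if "hopeless" in present:
        score += 0.3
    if "means" in present:
        score += 0.4
    if "hopeless" in present and "means" in present:
        score += 0.2  # interaction term, split evenly between the two tokens
    return score


phi = shapley_values(["i", "hopeless", "means"], toy_risk_score)
```

By construction the attributions sum to the difference between the score with all tokens present and the score with none, which is the property that makes Shapley values attractive for attributing a prediction to individual input features.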
Results:
Based on data from 1,348 help-seeking encounters, the transformer-based classifier yielded a macro-averaged area under the curve (AUC) of 0.93 (95% confidence interval [CI] [0.87, 0.99]) and a macro-averaged F1 score of 0.79 (95% CI [0.60, 0.96]). It outperformed the word-vector-based baseline model (AUC = 0.77, 95% CI [0.63, 0.89]; F1 score = 0.56, 95% CI [0.0, 0.65]). The SHAP model highlighted language features such as "I-talk," phrases indicating low self-esteem and self-hatred, lethal means, hopelessness, and body issues as predictive of suicidal ideation and behaviors.
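For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute a macro-averaged F1 score, a one-vs-rest macro-averaged AUC, and a percentile bootstrap confidence interval. It assumes a multi-class setup with hypothetical label names; the abstract does not state how the authors computed their intervals, so the bootstrap here is an assumption, not their procedure.

```python
import random
from typing import Dict, List, Sequence, Tuple


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Sequence[str], probs: List[Dict[str, float]], labels: Sequence[str]) -> float:
    """One-vs-rest AUC per class, averaged without class weighting."""
    aucs = [binary_auc([t == c for t in y_true], [p[c] for p in probs]) for c in labels]
    return sum(aucs) / len(labels)


def bootstrap_ci_macro_f1(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Sequence[str],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval: resample cases with replacement, recompute F1."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx], labels))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Macro-averaging gives each class equal weight regardless of its frequency, which matters when higher-risk classes are rare. With few cases in a class, bootstrap intervals can become very wide, as resamples may contain few or no examples of that class.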
Conclusions:
Neural networks using LM-based transfer learning can effectively identify suicidal ideation and advanced suicidal engagement. The explainer model additionally revealed language features associated with each of these suicidal phenomena. Such models may support clinical decision-making in the context of suicide prevention services.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.