Accepted for/Published in: JMIR Mental Health
Date Submitted: Sep 21, 2020
Date Accepted: Jun 3, 2021
Suicide Risk and Protective Factors in Online Support Forum Posts: Methods for Valid and Reliable Annotation
ABSTRACT
Background:
Online communities provide support for individuals looking for help with suicidal ideation and crisis. Although community data is increasingly used to develop machine learning models that infer who might be at risk, few efforts have designed well-validated annotations to identify the risk and protective factors described in online posts. Such annotations can enrich and augment computational assessment approaches to identify appropriate intervention points, which are useful to public health professionals and suicide prevention researchers.
Objective:
This qualitative study aims to develop a valid and reliable annotation scheme for evaluating risk and protective factors for suicidal ideation expressed in posts in suicide crisis forums.
Methods:
We designed a valid, reliable, and clinically grounded process for identifying risk and protective markers in social media data. This scheme draws on prior work on construct validity and the social sciences of measurement. We then applied the scheme to annotate 200 posts from r/SuicideWatch, a Reddit community focused on suicide crisis.
Results:
We describe the resulting annotation scheme, which is consistent with leading public health information coding schemes for suicide and advances attention to protective factors. Our study shows high internal validity, and our results indicate that the approach is consistent with findings from prior work.
Conclusions:
Our work formalizes a framework that incorporates construct validity into the development of annotation schemes for suicide risk on social media. It also furthers the understanding of risk and protective factors expressed in social media data. This may help public health programming to prevent suicide, as well as computational social science research and investigations that rely on the quality of labels for downstream machine learning tasks.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.