Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Jan 9, 2023
Date Accepted: Jul 4, 2023

The final, peer-reviewed published version of this preprint can be found here:

Using Machine Learning of Online Expression to Explain Recovery Trajectories: Content Analytic Approach to Studying a Substance Use Disorder Forum

Yang EF, Kornfield R, Liu Y, Chih MY, Sarma P, Gustafson D, Curtin J, Shah D

J Med Internet Res 2023;25:e45589

DOI: 10.2196/45589

PMID: 37606984

PMCID: 10481212

Digital Traces from a Substance Use Disorder Forum: Using Machine Learning of Online Expression to Explain Recovery Trajectories

  • Ellie Fan Yang; 
  • Rachel Kornfield; 
  • Yan Liu; 
  • Ming-Yuan Chih; 
  • Prathusha Sarma; 
  • David Gustafson; 
  • John Curtin; 
  • Dhavan Shah

ABSTRACT

Background:

Smartphone-based digital health applications (“apps”) are increasingly used to support behavior change and prevent relapse among those with substance use disorders (SUDs). These systems also collect a wealth of data from participants, including the content of messages exchanged in peer-to-peer support forums. The ways individuals self-disclose and exchange social support in these forums may provide insight into their recovery course, but manual review of a large corpus of text by human coders is inefficient.

Objective:

This study first evaluates the feasibility of applying supervised machine learning to perform large-scale automated content analysis of an online peer-to-peer discussion forum. Second, we use the machine-coded data to examine, at scale, how communication styles relate to writers’ substance use and well-being outcomes at six months.

Methods:

Data were collected from a smartphone app that connects patients with SUDs to online peer support via a discussion forum. Two hundred sixty-eight patients over 18 years old with SUD diagnoses were recruited by primary care providers from three Federally Qualified Healthcare Centers in the United States beginning in 2014. Two waves of survey data were collected to measure demographic characteristics and study outcomes: one at baseline (before accessing the app) and one after six months of using the app. Messages were downloaded from the peer-to-peer forum and subjected to manual content analysis to identify forms of social support and self-disclosure. These manually coded data were used to train supervised machine learning algorithms to automatically identify seven types of expression (emotional support, informational support, negative affect, change talk, insightful disclosure, gratitude, and universality disclosure). Subsequently, regression analyses examined how each expression type, represented as a proportion of a user’s total messages, was associated with recovery outcomes at six months, while controlling for these outcomes at baseline.
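As a rough illustration of the predictor described above (each expression type as a proportion of a user's total messages), the following Python sketch computes those proportions from machine-coded messages. The function name, the input layout, and the sample data are hypothetical, and the sketch assumes a message may carry several labels or none; the abstract does not specify these details.

```python
from collections import Counter, defaultdict

# The seven expression types named in the abstract.
EXPRESSION_TYPES = (
    "emotional support", "informational support", "negative affect",
    "change talk", "insightful disclosure", "gratitude",
    "universality disclosure",
)

def expression_proportions(messages):
    """Return {user_id: {expression_type: share of that user's messages}}.

    `messages` is an iterable of (user_id, labels) pairs, where `labels` is
    the set of expression types assigned to one message. Assumption: a
    message may carry several labels or none.
    """
    totals = Counter()            # total messages per user (the denominator)
    hits = defaultdict(Counter)   # messages per user per expression type
    for user_id, labels in messages:
        totals[user_id] += 1
        for label in set(labels):
            hits[user_id][label] += 1
    return {
        user: {t: hits[user][t] / n for t in EXPRESSION_TYPES}
        for user, n in totals.items()
    }

# Hypothetical machine-coded messages (illustrative only, not study data).
coded = [
    ("u1", {"emotional support"}),
    ("u1", {"emotional support", "gratitude"}),
    ("u1", set()),
    ("u1", {"negative affect"}),
    ("u2", {"universality disclosure"}),
    ("u2", {"negative affect"}),
]
props = expression_proportions(coded)
print(props["u1"]["emotional support"])  # 2 of u1's 4 messages
```

Each per-user proportion would then enter a regression alongside the baseline value of the outcome; that modeling step is not sketched here.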

Results:

Over six months, 231 participants posted on the app’s support forum, of whom 216 (94%) posted at least one message in the content categories of interest. These 216 participants generated 10,503 messages over six months. We found, first, that our supervised machine learning approach allowed for large-scale content coding while retaining a high level of accuracy (average F-score of 0.86 across the content categories). Second, individuals’ expression styles were associated with recovery outcomes. For social support, a greater proportion of messages giving emotional support to peers was related to reduced substance use (odds ratio = 0.12, p = 0.032). For self-disclosure, a greater proportion of messages expressing universality (feelings of oneness or closeness to the support group) was related to improved quality of life (β = 11.83, p = 0.038), whereas a greater proportion of negative affect expressions was negatively related to quality of life (β = -11.00, p = 0.045) and mood (β = -1.49, p = 0.007). We also found that the proportion of messages expressing emotional support and universality increased over time.
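For readers unfamiliar with the accuracy metric reported above, the following Python sketch shows how an F-score is computed per category and then averaged. It assumes an unweighted (macro) average over binary per-category labels; the abstract does not say which averaging scheme was used, and the function names and labels below are hypothetical.

```python
def f_score(gold, predicted):
    """F1 for one binary category: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def average_f_score(per_category):
    """Unweighted mean of per-category F1 scores (macro average, an assumption)."""
    scores = [f_score(gold, pred) for gold, pred in per_category.values()]
    return sum(scores) / len(scores)

# Hypothetical human codes vs. classifier output (illustrative only).
labels = {
    "emotional support": ([1, 1, 0, 0, 1], [1, 1, 0, 1, 1]),
    "negative affect":   ([0, 1, 1, 0, 0], [0, 1, 0, 0, 0]),
}
per_cat = {name: f_score(g, p) for name, (g, p) in labels.items()}
macro = average_f_score(labels)
print(per_cat, macro)
```

An F-score balances false positives against false negatives, which matters when a category such as universality appears in only a small share of messages and plain accuracy would look high even for a classifier that never predicts it.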

Conclusions:

This study highlights a method of computer-assisted content analysis with the potential to provide real-time insights into peer-to-peer communication dynamics in online discussion contexts. Expressions of emotional support, universality, and negative affect were significantly related to recovery outcomes, and attending to these dynamics may be important for appropriate and timely intervention. The increasing proportion of emotional support and universality suggests potential benefits of sustaining engagement in peer-to-peer forums.

Conclusions:

Our findings show that the expression types linked to positive recovery outcomes (i.e., emotional support giving and universality expressions) increased over time as a proportion of messages sent, suggesting the potential importance of sustaining participation and motivating more participants to interact. With the prevalence of the Internet, online peer-to-peer forums represent a growing support venue for those in recovery. This study extracted seven types of messages exchanged on a smartphone-based forum and applied supervised ML to perform large-scale quantitative content analysis over six months. Analyses leveraging the machine-coded data suggest forms of peer-to-peer communication that distinguish individuals’ likely recovery course, notably emotional support, universality, and negative affect expressions. Attending to these forms of expression may help to develop interventions that better respond to participants’ recovery needs.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.