Accepted for/Published in: JMIR Infodemiology

Date Submitted: Apr 4, 2025
Open Peer Review Period: Apr 16, 2025 - Jun 11, 2025
Date Accepted: Jul 20, 2025

The final, peer-reviewed published version of this preprint can be found here:

Antisdel J, Miller WR, Groves D

Data Mining Trauma: AI-Assisted Qualitative Study of Cyber Victimization on Reddit

JMIR Infodemiology 2025;5:e75493

DOI: 10.2196/75493

PMID: 40902086

PMCID: 12407219

Data Mining Trauma: An AI-Assisted Qualitative Study of Cyber Victimization on Reddit

  • J'Andra Antisdel
  • Wendy R Miller
  • Doyle Groves

ABSTRACT

Background:

Cyber victimization exposes teens to numerous risks. Their developmental stage often leaves them unaware of potential dangers, making them susceptible to psychological distress. Despite this vulnerability, methods for identifying teens at risk of cyber victimization within healthcare settings are limited, as is research that explores their experiences of cyber victimization. The purpose of this study was to analyze how teens describe experiences of cyber victimization on the social media platform Reddit using data mining.

Objective:

This study aimed to characterize how teens on Reddit discuss their experiences of cyber victimization, using data mining and computational analysis of unsolicited data.

Methods:

This computational qualitative study used data mining, Word Adjacency Graph (WAG) modeling, and thematic analysis to analyze Reddit users' discussions of cyber victimization. Posts from 2012 to 2023 in the subreddits r/cyberbullying and r/bullying were eligible for inclusion. GPT-4, an advanced artificial intelligence language model, summarized posts and assisted in cluster labeling. Posts were reviewed to remove irrelevant content and duplicates. User anonymity was maintained throughout the study.
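As a rough illustration of the idea behind a word adjacency graph, the following minimal sketch counts how often pairs of words appear side by side across posts and then groups words linked by repeated adjacency. It is not the authors' WAG implementation: the function names, the `min_weight` threshold, the sample posts, and the use of connected components as a stand-in for clustering are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_adjacency_graph(posts):
    """Count how often each unordered pair of distinct words appears side by side."""
    edges = Counter()
    for post in posts:
        words = tokenize(post)
        for left, right in zip(words, words[1:]):
            if left != right:
                edges[tuple(sorted((left, right)))] += 1
    return edges


def clusters_from_edges(edges, min_weight=2):
    """Group words into connected components, keeping only edges seen at least
    min_weight times (an assumed stand-in for the study's clustering step)."""
    neighbors = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen, clusters = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical posts, invented for this example (not study data).
posts = [
    "they keep sending mean messages",
    "mean messages every day",
    "i blocked the account",
    "blocked the account and reported it",
]
edges = build_adjacency_graph(posts)
clusters = clusters_from_edges(edges)
print(clusters)
```

In a workflow like the one described, each resulting cluster would then be reviewed and labeled, a step the study assigned in part to GPT-4.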

Results:

A total of 13,381 posts from 3,283 Reddit users were analyzed, with 5.07% originating between 2012 and 2018 and 94.93% between 2019 and 2023. The WAG modeling approach identified 38 clusters, 35 of which were deemed relevant to cyber victimization experiences. Two clusters containing irrelevant material were excluded. Six overarching themes emerged: (1) psychological impact, (2) coping and healing, (3) protecting yourself online, (4) protecting yourself offline, (5) victimization across various settings, and (6) seeking meaning and understanding.

Conclusions:

The study highlights the effectiveness of data mining and AI in analyzing large public data sets for qualitative research. These methods can inform future studies on risky internet behavior, victimization, and assessment strategies in healthcare settings.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.