Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 21, 2020
Date Accepted: Jun 4, 2020

The final, peer-reviewed published version of this preprint can be found here:

Social Media Text Mining Framework for Drug Abuse: Development and Validation Study With an Opioid Crisis Case Analysis

Nasralah T, El-Gayar O, Wang Y

Social Media Text Mining Framework for Drug Abuse: Development and Validation Study With an Opioid Crisis Case Analysis

J Med Internet Res 2020;22(8):e18350

DOI: 10.2196/18350

PMID: 32788147

PMCID: 7446758

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Social Media Text Mining Framework for Drug Abuse: An Opioid Crisis Case Analysis

  • Tareq Nasralah
  • Omar El-Gayar
  • Yong Wang

ABSTRACT

Background:

Social media is considered a promising and viable source of data for gaining insights into various disease conditions and into patients’ attitudes, behaviors, and medications. Social media can be used to recognize communication and behavioral themes of problematic use of prescription drugs. However, mining and analyzing social media data pose challenges and limitations with respect to topic deduction and data quality. As a result, there is a need for a structured approach to analyzing social media content related to drug abuse in a manner that mitigates the challenges surrounding the use of such a data source.

Objective:

The objective of this research was to develop and evaluate a framework for mining and analyzing social media content related to drug abuse. The framework - which consists of four phases, namely, topic discovery and detection; data collection; data preparation and quality; and analysis and results - is designed to mitigate challenges and limitations related to topic deduction and data quality in social media data analytics for drug abuse.

Methods:

We developed a social media text mining framework for drug abuse. The framework consists of four phases. In the discovery and topic detection phase, we defined terms that relate to keywords, categories, and characteristics according to the topic of interest and the objective of monitoring. In the data collection phase, we used Crimson Hexagon to collect data with a search query informed by a drug abuse ontology. In the data preparation and quality phase, we prepared the data for the analysis task and evaluated the quality of the data using a proposed evaluation matrix. Finally, in the analysis and results phase, we chose a suitable data analysis approach to analyze the collected data. The framework was evaluated using the opioid epidemic as our drug abuse case analysis.
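
To make the four-phase flow concrete, the following is a minimal sketch in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the sample posts, the keyword list, the function names, and the two quality checks (duplicate removal and a minimum word count) are hypothetical stand-ins. The actual study collected data through Crimson Hexagon with an ontology-informed query and assessed quality with its own evaluation matrix, neither of which is reproduced here.

```python
import re
from collections import Counter

# Hypothetical sample posts standing in for collected data (not from the study).
SAMPLE_POSTS = [
    "Worried about the opioid epidemic in our town",
    "Worried about the opioid epidemic in our town",  # exact duplicate
    "Doctors discuss oxycodone prescriptions and overdose risk",
    "Great weather today",
    "RT",  # too short to analyze
]


def build_query(keywords):
    """Phase 1 (discovery and topic detection): turn topic keywords into an OR pattern."""
    return re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)


def collect(posts, query):
    """Phase 2 (data collection): keep only posts that match the query."""
    return [p for p in posts if query.search(p)]


def prepare(posts, min_words=3):
    """Phase 3 (data preparation and quality): placeholder checks only.

    Drops exact duplicates and very short posts; these checks are assumptions,
    not the evaluation matrix proposed in the paper.
    """
    seen, cleaned = set(), []
    for post in posts:
        key = post.strip().lower()
        if key in seen or len(key.split()) < min_words:
            continue
        seen.add(key)
        cleaned.append(post)
    return cleaned


def analyze(posts, query):
    """Phase 4 (analysis and results): count how often each keyword is mentioned."""
    counts = Counter()
    for post in posts:
        counts.update(match.lower() for match in query.findall(post))
    return counts


def run_pipeline(posts, keywords):
    """Run the four phases in order and return keyword mention counts."""
    query = build_query(keywords)
    return analyze(prepare(collect(posts, query)), query)


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_POSTS, ["opioid", "oxycodone", "overdose"]))
```

Keeping each phase as a separate function mirrors the framework's structure, so a different collection source or a richer quality assessment could replace one step without changing the others.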

Results:

We developed and validated a social media text mining framework for drug abuse. We demonstrated the applicability of the proposed framework to identify public concerns toward the opioid epidemic and the most discussed topics on social media that relate to opioids. Using the framework, our data analysis identified existing and new discussion topics related to the opioid drug epidemic.

Conclusions:

The proposed framework addressed challenges related to topic detection and data quality. We demonstrated the applicability of the proposed framework to identify the common concerns toward the opioid epidemic and the most discussed topics on social media related to opioids.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.